Wikipedia talk:Writing articles with large language models

This is the talk page for discussing improvements to the Writing articles with large language models page.

Put new text under old text. Click here to start a new topic.
New to Wikipedia? Welcome! Learn to edit; get help.

Shortcut

WT:NEWLLMWT:NEWLLM

Archives: 1

view · edit

Frequently asked questions

Q1: What is the purpose of this guideline?

A1: To establish a ground rule against using AI tools to create articles from scratch.

Q2: This guideline covers so little! What's the point?

A2: The point is to have something. Instead of trying to get consensus for the perfect guideline on AI, which doesn't exist, we have been in practice pursuing a piecemeal approach: restrictions on AI images, AI-generated comments, etc. This is the next step. Eventually, we might merge them all into one to create one guideline on AI use.

Q3: Why doesn't this guideline explain or justify itself?

A3: Guidelines aren't information pages. We have plenty of information already about why using LLMs is usually a bad idea at WP:LLM.

Q4: Why is this guideline only restricted to new articles?

A4: This guideline, originally a proposal, was intentionally designed to be simple and narrow on purpose so consensus could easily be gained and it could become a guideline, with the intent to flesh it out in later discussions.

Why is this guideline only restricted to "new articles"? Shouldn't this apply to all articles? (and talk pages and so on...)

Under my own reading of this rule, it seems like it only applies to new articles, and that pre-existing articles are somehow allowed to have AI-generated text inserted into them. GarethBaloney (talk) 13:46, 24 November 2025 (UTC)[reply]

I think because it's a badly written sentence and was erroneously promoted to Guideline. qcne (talk) 13:48, 24 November 2025 (UTC)[reply]

Well if people are saying it's a badly written guideline then we should make a new discussion on changing it! GarethBaloney (talk) 14:05, 24 November 2025 (UTC)[reply]

Yes! Let's have all our guidelines be padded out with twelve-thousand word essays defending and justifying them and providing supplementary information such that no-one will ever read them and newbies have no freaking idea what it's actually telling them. Cremastra (talk · contribs) 02:05, 25 November 2025 (UTC)[reply]

The guideline and RFC were probably written minimalistically to increase its chances of passing an RFC, with the intent to flesh it out in follow up discussions. –Novem Linguae (talk) 21:26, 24 November 2025 (UTC)[reply]

This one. Cremastra (talk · contribs) 01:08, 25 November 2025 (UTC)[reply]

Further amendment proposal #1: Festucalex

Well, habemus guideline. Now, how is it going to be enforced, given the fact that the guideline is donut-shaped? We might as well address the "from scratch" loophole and preempt the thousands of man-hours that are going to be wasted debating it with LLM users. How should we define "from scratch"? In an ideal situation, the guideline would be this:

This will close the loophole. Any improvements are welcome. 〜 Festucalex • talk 14:05, 24 November 2025 (UTC)[reply]

Strong support. The usage of LLMs to directly edit, add to, or create articles should not be accepted in any way. The high likelihood, and inherent poor quality of sourcing of LLMs makes them ill suited for use on Wikipedia, and genuine human writing and research should be the standard. Stickymatch 02:55, 25 November 2025 (UTC)[reply]

LLM sourcing can be 100% controlled (editor selects sources, uploads them, and explicitly prohibits using anything else). So the poor choice of sources is a human factor, evident in many human-written articles here. Викидим (talk) 05:49, 25 November 2025 (UTC)[reply]

I am not sure if we can amend the proposal after all these !votes have been made, but could you make an exclusion for grammer checkers? Mikeycdiamond (talk) 15:52, 26 November 2025 (UTC)[reply]

These are not !votes. This is an WP:RFCBEFORE discussion. voorts (talk/contributions) 16:33, 26 November 2025 (UTC)[reply]

P.S. I just finished writing an essay against one of the proposed "accepted uses" for LLMs on Wikipedia. I welcome your feedback on the essay's talkpage. User:Festucalex/Don't use LLMs as search engines 〜 Festucalex • talk 16:54, 24 November 2025 (UTC)[reply]

I support this wholeheartedly. GarethBaloney (talk) 14:07, 24 November 2025 (UTC)[reply]

Support. "From scratch" is way too generous. TheBritinator (talk) 14:25, 24 November 2025 (UTC)[reply]

Any policy or guideline that says "ban all uses of LLMs" is bound to get significant opposition. SuperPianoMan9167 (talk) 14:31, 24 November 2025 (UTC)[reply]

And all policies and guidelines have a built-in loophole anyway. SuperPianoMan9167 (talk) 14:36, 24 November 2025 (UTC)[reply]

The fact that WP:IAR exists doesn't mean that we ought to actively introduce crippling loopholes into guidelines. Imagine if we banned vandalism only on new articles, or only on articles that begin with the letter P. 〜 Festucalex • talk 14:57, 24 November 2025 (UTC)[reply]

If you look at the RfC you can see a significant number of users who disagree with the assertion that "all LLM use is bad", which is why I have doubts that a proposal to ban LLMs entirely will ever pass. SuperPianoMan9167 (talk) 15:00, 24 November 2025 (UTC)[reply]

It's WP:NOTVOTE and it should never be. As I said before, anyone who wants to open up uses for LLMs on Wikipedia should explain precisely, minutely, down to the atomic level how and why LLMs can be used on Wikipedia and how these uses are legitimate and minimally disruptive as opposed to all other uses. The case against LLMs has been made practically thousands of times, while the pro-LLM case consists of nothing more than handwaving towards vague say-so assertions and AI company marketing buzzwords. 〜 Festucalex • talk 15:09, 24 November 2025 (UTC)[reply]

WikiProject AI Tools was formed to coordinate legitimate uses of LLMs. SuperPianoMan9167 (talk) 22:31, 24 November 2025 (UTC)[reply]

Also, the rules are principles. The general idea of this guideline is that using LLMs to generate new articles is bad. It is not and should not be a blanket ban on LLMs. LLMs are tools. Like all tools, they have valid use cases but can be misused. Yes, their outputs may be inherently unreliable, but it is incorrect to say they have no use cases. SuperPianoMan9167 (talk) 22:39, 24 November 2025 (UTC)[reply]

Support but with the caveat that I think it's too broad for what this policy has already been approved for. This edit implies any use of LLMs is unacceptable, even if it's not LLM-generated content being included in an article. Given that there's still arguably a carveout for using LLMs to assist with idea generation etc, my Counterproposal if people find it more appealing can be found at #Further amendment proposal #3: Athanelar. Athanelar (talk) 14:43, 24 November 2025 (UTC)[reply]

I think we ought to actively discourage other non-submission uses, even if we can't detect them. At least we'd be making it clear that the community disapproves. This only will stop the honest ones, but hey, that's something. 〜 Festucalex • talk 14:55, 24 November 2025 (UTC)[reply]

I agree, that's why my initial statement is support, I just wanted to present a counterproposal in case the majority would prefer something that doesn't widen the scope so much. Athanelar (talk) 14:59, 24 November 2025 (UTC)[reply]

Can you put the counterproposal in a different section to avoid confusion? NicheSports (talk) 15:10, 24 November 2025 (UTC)[reply]

Done. Athanelar (talk) 15:19, 24 November 2025 (UTC)[reply]

Support. We should probably add clarifying language to this (I have some ready I can propose), but definitely agree and think the community is ready to support a complete LLM ban NicheSports (talk) 15:09, 24 November 2025 (UTC) Now that I understand what is meant by this proposal, I don't support it. I would support a ban on using LLMs to generate article content (per Kowal2701) NicheSports (talk) 00:11, 25 November 2025 (UTC)[reply]

Similar to my comment below, this completely changes the purpose of this guideline (expanding its scope from new articles to all edits) and would require a new RfC. Toadspike [Talk] 15:48, 24 November 2025 (UTC)[reply]

Definitely – I interpreted this as workshopping something that will be brought to another RFC. Is that fine to do here or should we move it to WP:AIC? NicheSports (talk) 15:50, 24 November 2025 (UTC)[reply]

Yes, what we're doing here is the WP:RFCBEFORE that the original proposal never got. There are already 3 wordings on the table: mine, qcne's, and Athanelar's, and I hope this eventually crystallizes (after more refining) into a community-wide RFC. As the closing note pointed out, this issue requires a lot more work and discussion, and a lot of people agreed to Cremastra's proposal because they wanted anything to be instituted to stem the bleeding while the community deliberated on a wider policy. 〜 Festucalex • talk 16:14, 24 November 2025 (UTC)[reply]

Oppose. AI is a tool. For example, I routinely use AI to generate {{cite journal}} templates from loose text (like the references in other publications) or to check my grammar. This is IMHO no more dangerous than using the https://citer.toolforge.org/ for the same purpose (or Grammarly to check the grammar). We should encourage the disclosure, not start an un-enforceable Prohibition. Викидим (talk) 21:29, 24 November 2025 (UTC)[reply]

@Викидим What are your thoughts on my proposal #2, below, which has a specific carve-out for limited LLM use? qcne (talk) 21:31, 24 November 2025 (UTC)[reply]

Does creating the journal template count as generating text for articles? GarethBaloney (talk) 21:44, 24 November 2025 (UTC)[reply]

The sources are certainly part of the text. According to views expressed in the discussion, AI can hallucinate the citation. For the avoidance of doubt, in my opinion – and experience – this is not the case with this use, but then there are many other safe uses of AI – like translation – and all of these IMHO shall be explicitly allowed (yes, I also happen to like m-dashes). Викидим (talk) 22:10, 24 November 2025 (UTC)[reply]

This is IMHO no more dangerous than using [...] I strongly disagree that using the hallucination machine specifically designed to create natural-sounding but not-necessarily-accurate language output is 'no more dangerous' for these purposes than using tools specifically designed for the tasks at hand. Athanelar (talk) 21:54, 24 November 2025 (UTC)[reply]

The AI is not made to manufacture lies any more than a keyboard is. The difference is in performance and intent of the user – these are the ones we might want to address. Blaming tools is IMHO a dead end, Luddites, ostensibly also fighting for quality, quickly lost their battle. Викидим (talk) 22:13, 24 November 2025 (UTC)[reply]

Are unscrupulous editors not more likely to use something like ChatGPT to try and sound professional even when they aren't? Besides, Grammarly is not the same as asking an LLM to generate a Wikipedia article, complete with possibly fake sources. GarethBaloney (talk) 22:59, 24 November 2025 (UTC)[reply]

(1) try and sound professional even when they aren't We are (almost) all amateurs here, so a tool that makes non-professionals sound better is not necessarily bad. (2) The proposal reads should not be used to edit Wikipedia leaving no exceptions for grammar checking. Викидим (talk) 23:23, 24 November 2025 (UTC)[reply]

Grammar checking can done (and has been being done for decades) using non-LLM artificial intelligence models and programs. 〜 Festucalex • talk 23:35, 24 November 2025 (UTC)[reply]

I was going to point this out, haha. There's been automatic grammar checking and spellcheck since what- Word 97? No LLM required. Stickymatch 02:58, 25 November 2025 (UTC)[reply]

All modern translation and grammar checking tools use AI, as it produces superior results. Google for obvious reasons was heavily invested into both for almost 20 years. According to my source, they at first were trying to go the non-AI way (studying and parsing individual grammars, etc.) only to discover than direct mapping between texts does a better job at a lower cost. Everyone else of any importance followed their approach many years ago. It was just not a generic AI that we know now, but an AI nonetheless. Some detail can be found, for example, on p. 19 of the 2008 thesis [1] (there should be better written sources, naturally, but the fact is very well known). Викидим (talk) 06:03, 25 November 2025 (UTC)[reply]

Strong support: removes all ambiguity. Z E T A^C 21:34, 24 November 2025 (UTC)[reply]

Oppose, people often use stuff like Grammarly. The ban needs to be on generating content Kowal2701 (talk) 21:38, 24 November 2025 (UTC)[reply]

Grammarly is not an LLM. 〜 Festucalex • talk 23:34, 24 November 2025 (UTC)[reply]

It's powered by LLMs: In April 2023, Grammarly launched a product using generative AI built on the GPT-3 large language models. (from the article) SuperPianoMan9167 (talk) 23:35, 24 November 2025 (UTC)[reply]

Generative AI tools like Grammarly are powered by a large language model, or LLM - from the Grammarly website [2] GreenLipstickLesbian 💌🧸 23:37, 24 November 2025 (UTC)[reply]

Then users can use a grammar checker other than Grammarly. 〜 Festucalex • talk 23:40, 24 November 2025 (UTC)[reply]

Wow. voorts (talk/contributions) 23:45, 24 November 2025 (UTC)[reply]

I think what users on both sides of this ideological divide are running up against is a common thing that happens whenever there is such a divide between two groups; both groups assume that members of the other group are operating on the same fundamental value system that they are, and that their arguments are built from that same value system.

I.e., the 'less restrictive' party here (voorts, qcne et al) is beginning from the core value that 'the reason LLMs are problematic is that their output is generally not compatible with Wikipedia's standards,' and the argument that stems from that is 'any LLM policy we make should be designed around bringing the result of LLM usage in line with Wikipedia's standards, whether that be directly LLM-generated text, or simply users utilising LLMs in their creative process.'

The 'more restrictive' part here (myself, Festucalex et al) is beginning from the core value that 'LLMs and their output are inherently undesirable and detrimental (for some of us to the internet as a whole, for others perhaps specifically only to Wikipedia)' and the argument that stems from that is 'any LLM policy we make should be designed around minimising the influence of LLMs on the content of Wikipedia.'

That's why Festucalex pivoted here and said people should use something other than Grammarly. We simply believe that it's imperative that we purge LLM output from Wikipedia, regardless of whether it's reviewed or policy compliant or anything else. It's also important to keep in mind that NEWLLM as it stands is a product of the latter ideology, not the former, and I think that's why it appears to be so flawed to people like qcne; because it's solving a completely different problem than the one they're trying to solve. Athanelar (talk) 01:03, 25 November 2025 (UTC)[reply]

I understand your views. What I don't see is evidence. voorts (talk/contributions) 01:11, 25 November 2025 (UTC)[reply]

Exactly. I made an identical point about this fundamental divide in the RfC. (I have discovered I am pivoting more towards the "less restrictive" side in my comments here.) SuperPianoMan9167 (talk) 01:59, 25 November 2025 (UTC)[reply]

Yes, I think people understand the divide is between this idea of fundamentalism (the intrinsic nature of LLMs is that they are bad) and those who don't subscribe to it. But what many of us who oppose this fundamentalism think is that rather than being based on evidence (voorts), it's an article of faith. Katzrockso (talk) 02:56, 25 November 2025 (UTC)[reply]

Not workable – if somebody comes up to me and says "Hey, you've made a mistake in Hanako (elephant)" or shows up on BLP saying "You have my birthdate wrong", then I don't care if they use a LLM to write their post, and I don't care if they use an LLM to translate it from their native language. I'm not even sure I care if they use the LLM to make the edit/explain themselves in the edit summary (but I'd rather they disclose it, for obvious reasons), assuming they do it right.

Ultimately, somebody who repeatedly introduces hoax material/fictitious references to articles repeatedly should be blocked quickly, whether they're using AI or not. Somebody who repeatedly introduces spammy text repeatedly should be blocked, whether they have a COI or not. Somebody who repeatedly introduces unsourced negative BLP information should be blocked, whether or not they're a vandal/have a COI. Somebody who repeatedly inserts copyright violations should be blocked, whether they're acting in good faith or not. The LLM is a red herring – once we've established that the content somebody writes is seriously flawed in a way that's not just accidentally, we need to block the contributor. If they say "but it's not my fault, ChatGPT told me to" then unblocks admins can take that into consideration & we can tban that editor from using automated or semi-automated tools as an unblock condition. GreenLipstickLesbian 💌🧸 23:00, 24 November 2025 (UTC)[reply]

+1 This whole guideline is everyone just sticking their heads in the sand and hoping LLM usage will go away. We should be thinking about how LLMs can be used well, not outright banning their use. voorts (talk/contributions) 23:08, 24 November 2025 (UTC)[reply]

It's also yet another example of why PAGmaking on the fly and without advanced deliberation is a terrible idea. voorts (talk/contributions) 23:10, 24 November 2025 (UTC)[reply]

There are no legitimate uses for LLMs, just like there are no legitimate uses for chemical weapons. They're both technically a tool, and anyone can argue that sarin gas can technically be used against rodents, but is it really worth the risk of having it around the kitchen? 〜 Festucalex • talk 23:46, 24 November 2025 (UTC)[reply]

Are you seriously comparing LLMs to chemical weapons? voorts (talk/contributions) 23:48, 24 November 2025 (UTC)[reply]

Yep. 〜 Festucalex • talk 23:49, 24 November 2025 (UTC)[reply]

65k bytes to get to Godwin's Law, nice! GreenLipstickLesbian 💌🧸 00:03, 25 November 2025 (UTC)[reply]

Festucalex please lol. Also, idk if this is written down anywhere, there's probably an essay, but the fastest way to nuke support for a plausible idea here is to start saying stuff like "X is like sarin gas" NicheSports (talk) 00:06, 25 November 2025 (UTC)[reply]

I think the analogy I'm making is clear: it's a technology whose risks override any potential benefits, at least in this context. Forget sarin gas, let's say it's like a pogo stick in a porcelain museum. 〜 Festucalex • talk 00:09, 25 November 2025 (UTC)[reply]

There are no legitimate uses for LLMs What about this, and this, and this, and this, and this, and this, and this, and...

You get the point. SuperPianoMan9167 (talk) 23:53, 24 November 2025 (UTC)[reply]

There are no legitimate uses of LLMs on Wikipedia. I have said it before and I will say it again. Even if it is impossible to stop all LLM usage, guidelines like this one can serve as a statement of principle. Yours, &c. RGloucester — ☎ 00:00, 25 November 2025 (UTC)[reply]

So everyone in WikiProject AI Tools is editing in bad faith? SuperPianoMan9167 (talk) 00:02, 25 November 2025 (UTC)[reply]

They're using bad tools in good faith because we don't have a comprehensive guideline yet. 〜 Festucalex • talk 00:04, 25 November 2025 (UTC)[reply]

Why can't LLMs ever be legitimately used on Wikipedia? voorts (talk/contributions) 00:06, 25 November 2025 (UTC)[reply]

What is the philosophical mission of Wikipedia? WP:ABOUT begins with the Jimbo quote Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That's what we're doing.

LLMs don't produce human knowledge. They produce realistic-sounding human language, because that's what they're designed to do, it's all they've ever been designed to do, it's all they can ever be designed to do – it's literally in their fundamental structure. Not only that, but the output they produce is explicitly biased by their programming and their training data, which are both determined by a private company with no transparency or oversight.

Would you be content if the entirety of Wikipedia's article content were created and maintained by a single editor? Let's assume that single editor is flawless in their work; all of their work is rigorous and meets the standards set by the community (who are still active in a non-article capacity), it's perfectly sourced etc; it's just that it's all coming from a single individual.

What about 90%? 80%? 50%? What percentage of the encyclopedia could be written and managed by a single individual before it would compromise the collaborative nature of Wikipedia?

Thesis 1: The output of an LLM is, effectively, the work of a single individual. Obviously it's more complex than that, but LLM output all has the same tone because it's all the product of the same algorithms from the same privatised training data.

Thesis 2: Given the opportunity, LLM output will comprise an increasingly large percentage of Wikipedia, because it is far faster to copyedit, rewrite and create with LLMs than it is to do so manually. This will only increase the more advanced LLMs get, because their output will require less and less human oversight to comply with Wikipedia's standards.

The conclusion, then, is how much of Wikipedia's total content you're willing to accept being authored by what is essentially a single individual with inscrutable biases and motivations. There must be some cutoff in your mind; and our contention is that if you allow them to get their foot in the door, then the result is going to end up going beyond whatever percentage cutoff you've decided as acceptable. Athanelar (talk) 02:48, 25 November 2025 (UTC)[reply]

"The output of an LLM is, effectively, the work of a single individual. Obviously it's more complex than that" is putting it lightly. Notwithstanding the fact that more than one LLM exists, editors who opposite anti-LLM fundamentalism here have consistently advocated for the necessity of human review and editing when evaluating LLM output. Katzrockso (talk) 02:58, 25 November 2025 (UTC)[reply]

editors who opposite anti-LLM fundamentalism here have consistently advocated for the necessity of human review and editing when evaluating LLM output.

Well, okay, take my initial example again, then. Let's say John Wikipedia is still producing 50 or 80 or 100% or whatever of Wikipedia's output, but it's first being checked by somebody else to make sure it meets standards. Would it now be acceptable that John Wikipedia is the sole author of the majority (or a plurality or simply a large percentage) of Wikipedia's content, simply because his work has been double-checked? Athanelar (talk) 03:09, 25 November 2025 (UTC)[reply]

Yes, if John Wikipedia's contributions all accurately represents the sources as evaluated by other editors and meets our content standards, why would that be a problem? Katzrockso (talk) 03:43, 25 November 2025 (UTC)[reply]

Well, that's just one of those fundamental value differences we'll never overcome, then. I don't think John Wikipedia should be the primary author of content on Wikipedia because that would undermine the point of Wikipedia being a communal project, and for that same reason I don't think we should allow AI-generated content to steadily overtake Wikipedia either, whether or not it's been reviewed or verified or what have you. Athanelar (talk) 03:47, 25 November 2025 (UTC)[reply]

This happens all the time at smaller Wikipedias. There just aren't enough people who speak some languages + can afford to spend hours/days/years editing + actually want to do this for fun to have "a communal project" the way that you're thinking of it. WhatamIdoing (talk) 06:09, 27 November 2025 (UTC)[reply]

What about uses of LLMs that aren't generating new content (which is what most of the tools at WikiProject AI Tools are about)? SuperPianoMan9167 (talk) 03:03, 25 November 2025 (UTC)[reply]

I don't have any issue with that, because it's functionally impossible to identify and police. That's why my proposal is worded differently to Festucalex's, because I think it's only sensible and possible to prohibit the inclusion of AI-generated text, not the use of AI in one's editing process at all. Athanelar (talk) 03:11, 25 November 2025 (UTC)[reply]

I asked why they can't ever be used. I have several FAs and GAs, but I'm terrible at spelling. If, as seems to be the direction the world is heading, most browsers replaced their original spellcheckers with LLM-powered ones, are you suggesting I'd need to install an obscure browser created by anti-AI people to avoid running afoul of this proposed dogmatism? voorts (talk/contributions) 13:38, 25 November 2025 (UTC)[reply]

No, my proposal is to ban adding AI-generated content to Wikipedia, not to ban people using AI as part of their human editing workflow, that would be unenforceable. Athanelar (talk) 14:13, 25 November 2025 (UTC)[reply]

None of these are legitimate, and I hope that our new guideline puts an end to them before they become standard practice. No use designing and marketing kitchen canisters for sarin gas. 〜 Festucalex • talk 00:02, 25 November 2025 (UTC)[reply]

This reads more like a moral panic than a logically & evidentially supported proposal Katzrockso (talk) 00:52, 25 November 2025 (UTC)[reply]

It's not a moral issue. LLMs undermine the whole foundation of this project. They were developed by companies that are in direct competition with Wikipedia. These companies have used our content with the aim of monetarising it through LLM chatbots, and now plot to replace Wikipedia altogether, à la Grokipedia. Promoting LLM use will rot the project from within, and ultimately result in its collapse. Yours, &c. RGloucester — ☎ 06:12, 25 November 2025 (UTC)[reply]

Slippery slope Katzrockso (talk) 14:09, 25 November 2025 (UTC)[reply]

Yes, it is a 'slippery slope' argument, if anything, a better term is 'death by a thousand cuts'. It is a common misconception that a slippery slope argument is an inherent fallacy. I find it very interesting that some editors here prefer to place emphasis on the quality of the content produced, rather than on the actual mission of the project. Let us take this kind of argument to its logical conclusion. If some form of LLM were to advance, and were able to produce content of equivalent quality to the best Wikipedia editors, would we wind up the project, our mission complete? I'd like to hope that the answer would be no, because Wikipedia is meant to be a free encyclopaedia that any human can edit.

When one outsources some function to these 'tools', whether it be spellchecking or article writing, it will inevitably result in the decline of one's own copyediting and writing skills. As our editors lose the skills they have gained by working on this encyclopaedia over these past two decades, they will become more and more reliant on the LLMs. What happens then, when the corporations that own these LLMs decide to cease providing their 'tools' to the masses gratis? Editors, with their own skills weakened, will become helpless. Perhaps only those with the ability to pay to access LLMs will be able to produce content that meets new quality standards that have shifted to align with LLM output. Wikipedia's quality will decline as the pool of skilled editors dwindles, and our audience will shift toward alternatives, like the LLMs themselves. The whole mission of the project will be called into question, as Wikipedia loses its competitive advantage in the marketplace of knowledge. Yours, &c. RGloucester — ☎ 00:20, 26 November 2025 (UTC)[reply]

But we shouldn't sacrifice newcomers in the name of preserving the project by blocking them for using LLMs right after they join when they have no clue why or how LLMs are unreliable. SuperPianoMan9167 (talk) 00:25, 26 November 2025 (UTC)[reply]

My hope for this guideline is that it will prevent that kind of blocking, since good faith newcomers who show up using LLMs will get reverted and linked to this page, instead of the previous situation where they get asked politely to stop, then when they don't, they eventually get dragged to ANI and TBanned from using LLMs, which is frustrating and much more difficult to understand than a simple page that says "Wikipedia doesn't accept LLM-generated articles because that's one of the things that makes Wikipedia different from Grokipedia". -- LWG ^talk 00:57, 26 November 2025 (UTC)[reply]

Assuming we adopt this proposal, and assuming that good faith newcomers abide, there will still be editors who get asked politely to stop (i.e., they will be warned), then when they don't, they eventually [will] get dragged to ANI and blocked, not TBANNED (by my count, only 3 editors are topic banned from LLM use per Wikipedia:Editing restrictions). I've blocked/revoked TPA of many accounts for repeated LLM use and I can assure you that almost none of those editors knew or cared about what any of our guidelines said. In no universe would a no-LLM rule result in any change to the process of having to drag people to ANI to get them blocked. voorts (talk/contributions) 01:11, 26 November 2025 (UTC)[reply]

^this.

To use a real example, every single time anybody makes a post, they agree not to copy paste content from other sites, attribute it if they copy from within Wikipedia, and there are sooooooooooooooooooooooo many copyright blocks given out every year. Most of these people unambigiously acted in good faith. And each and every one got dragged to a noticeboard, often multiple times, before they were blocked. I'm sorry, but this won't be any different - and Wikipedia naturally draws the type of people who like to ask "why", so we're still going to have to point them to WP:LLM and won't be swayed by a simple page saying "no, because I said so".GreenLipstickLesbian 💌🧸 08:05, 26 November 2025 (UTC)[reply]

SuperPianoMan, I agree with you, and I also agree with LWG. The problem until now was that Wikipedia has failed to clearly explain its stance on LLMs, blocking myriad editors without any obvious policy or guideline-based rationale. This ad hoc form of justice has gone on too long, and is unfair to newcomers, and is one reason why I supported the adoption of this guideline, despite its shortcomings. The community needs to clearly explain Wikipedia's purpose, and why LLMs are not suited for use on Wikipedia, to both new editors and our readership. Wikipedia should aim to promote the value of a project that is free, that anyone can edit, and that is made by independent men and women from right across the world. If anything, our position as a human encyclopaedia should be a merit in a competitive information marketplace. Yours, &c. RGloucester — ☎ 01:11, 26 November 2025 (UTC)[reply]

they eventually get dragged to ANI and TBanned from using LLMs, which is frustrating and much more difficult to understand than a simple page

Yes exactly. People were regularly being sanctioned for a rule that they could not have known about it because no such rule existed. Even if not a single newbie ends up reading this guideline, its existence is still beneficial, because it means we are no longer punishing people for breaking unwritten rules. Gnomingstuff (talk) 09:50, 26 November 2025 (UTC)[reply]

I don't think it's ever been practice to sanction somebody for just AI use, though? It's always been fictitious references, violating mass create, copyright issues, WP:V failures, UPE/COI, NPOV violations, ect. I'm not saying no admin has ever blocked a user for only using LLMs, (admins do act outside of policy, sometimes!) though I'd be interested to see any examples. Thanks, GreenLipstickLesbian 💌🧸 10:23, 26 November 2025 (UTC)[reply]

Usually it's more than just AI use if it ends up at ANI but I doubt the distinction is really getting through to people, and a lot of !votes to block, CBAN, TBAN, etc. are made with the rationale of "AI has no place on Wikipedia ever." Sometimes the bulk of the thing is that (example: Wikipedia:Administrators'_noticeboard/IncidentArchive1185#User:_BishalNepal323)

There's also the uw-ai to uw-ai4 series of templates, which implies a four-strikes rule; I don't use them but others do. Gnomingstuff (talk) 10:51, 26 November 2025 (UTC)[reply]

In your example, Ivanvector blocked for disruptive editing, not solely for AI use. voorts (talk/contributions) 14:08, 26 November 2025 (UTC)[reply]

What are we arguing about here? Obviously people are getting blocked for LLM misuse, not LLM use. And I agree with Gnomingstuff and LWG etc. I believe in AGF and have dozens of examples of editors who have stopped using LLMs after I alert them to the difficulty of using them in compliance with content policies. NicheSports (talk) 14:22, 26 November 2025 (UTC)[reply]

We're arguing about the assertion that we need a no AI rule because we've been blocking people solely for AI use without any attendant disruption. That is not true and therefore not a good reason to impose a no AI rule. voorts (talk/contributions) 14:23, 26 November 2025 (UTC)[reply]

To be more clear, when I said "the bulk of the thing" I meant the tenor of the responses in an average ANI posting. Several regulars at ANI generally seem to be under the impression that we do not allow AI, so most !votes are going to have largely unchallenged comments like CIR block now. This LLM shit needs to be stopped by any means necessary. or LLM use should warrant an immediate block, only lifted when a user can demonstrate a clear understanding that they can't use LLMs in any situation. Or if someone gets hit with a uw-ai2, they are told Please refrain from making edits generated using a large language model (an "AI chatbot" or other application using such technology) to Wikipedia pages. Gnomingstuff (talk) 00:39, 27 November 2025 (UTC)[reply]

People say a lot of incorrect things at ANI. We don't usually amend the PAGs to accommodate those people. voorts (talk/contributions) 01:05, 27 November 2025 (UTC)[reply]

On the contrary, that's exactly what we do. PAGs are meant to reflect the actual practice of editors. The process of updating old PAGs or creating new ones to reflect changes in editorial practice is the foundation that has built all of our policies and guidelines. Yours, &c. RGloucester — ☎ 03:10, 27 November 2025 (UTC)[reply]

We're not tho. Nobody as far as I can tell has ever been blocked solely for using AI/LLMs. This is a red herring. voorts (talk/contributions) 13:54, 26 November 2025 (UTC)[reply]

If tomorrow, a LLM came out that could produce a FA-quality article on a given topic in 2 minutes, would you still suggest that LLMs have no place on Wikipedia?

Histrionic comparisons about scenarios that won't happen go both ways. Katzrockso (talk) 07:59, 27 November 2025 (UTC)[reply]

Yes, I would, because using such a technology to produce articles is contrary to the purpose and mission of Wikipedia. Wikipedia's defining principles are that it is free, that any human can edit it, and that its content is produced collaboratively by divers volunteers. Others and I have already explained why machine-produced content contravenes these principles. I care less whether an article is 'FA-quality', whatever that means, and more about how it was made. Yours, &c. RGloucester — ☎ 08:46, 27 November 2025 (UTC)[reply]

Wikipedia's defining principles are that it is free, that any human can edit it, and that its content is produced collaboratively by divers volunteers. Others and I have already explained why machine-produced content contravenes these principles. I am certainly not a fan of LLMs for generating content. However, I don't see how you can say that a human editor, who chooses to use an LLM to generate some content, checks the content to make sure that it accurately reflects its sources and is otherwise PAG-compliant, and finally adds the sourced content to an article contravenes these principles. Wikipedia is no less free, any human can still edit it, and divers volunteers are still able to collaboratively work on the article. Even though that particular content happend to have been produced by a machine. Cheers, SunloungerFrog (talk) 09:12, 27 November 2025 (UTC)[reply]

Yes, in this hypothetical thought experiment. We don't live in a thought experiment. LLM output is getting better in that it is less obviously bad, but the nature of this kind of text generation means it is not suited well and may never be suited well to producing verifiable nonfiction articles. Gnomingstuff (talk) 14:18, 27 November 2025 (UTC)[reply]

Why? What's wrong with an LLM spellchecker other than that you don't like it? voorts (talk/contributions) 13:47, 25 November 2025 (UTC)[reply]

+1 Even the autocorrect on my iPhone uses a transformer, which is the same kind of neural network as that which powers LLMs. The major difference is in size (they're called large language models for a reason). SuperPianoMan9167 (talk) 14:19, 25 November 2025 (UTC)[reply]

Support. This guideline is a good start and I am glad it was approved but it should be expanded.LLMS are not an acceptable way to edit wiki as they cause lots of issues like hallucinations.Changing to oppose as I just realised this goes beyond creating content and would include thongs like grammerly .GothicGolem29 (Talk) 18:35, 28 November 2025 (UTC)[reply]
@GothicGolem29: The Grammarly thing isn't necessarily included. As long as it doesn't generate its own output, it's not really a large language model, even if it claims to use one. The important thing here is that de novo output doesn't make it to the encyclopedia. 〜 Festucalex • talk 23:46, 3 December 2025 (UTC)[reply]

Further amendment proposal #2: qcne

Why the current version of the guideline is bad: A single sentence that clunkily prohibits all LLM use on new articles. How do we define that? Does "from scratch" cover the lead section only? the whole article? a stub? a list? Dunno! It doesn't bother to say! This is banning a method without actually defining where it begins or ends. Since no one can reliably tell if an LLM was used, enforcement would be impossible. LLM detection is unreliable, and we already have CSD G15 to handle unreviewed LLM slop.

I wrote this up a while ago and am now posting it for community consensus. I did just replace the Guideline with my version, but was sadly reverted.

Version 1

See version 3 posted below

My suggested, much more comprehensive Guideline.

== Purpose and scope ==

This guideline describes how editors may and may not use large language models (LLMs, also known as AI chatbots) when editing Wikipedia. It applies to all content generated by an LLM regardless of model or vendor.

Editors remain fully responsible for all edits they make, including LLM-assisted edits. All edits must comply with existing Wikipedia policies.

== Do not use an LLM to write an article from scratch ==

Large language models should not be used to generate new Wikipedia articles from scratch.

Do not paste raw or lightly edited LLM output into a new article, or into a draft intended to become an article.
Do not use LLMs to create the bulk of the prose of an article, even if you intend to fact-check it later.

Where an article, draft, or prose is largely or entirely based on unedited or lightly edited LLM output, it may be draftified, nominated for deletion, or removed, especially where the content is unverifiable, fabricated, or otherwise seriously non-compliant.

== Why LLM-written encyclopaedia content is problematic ==

LLMs are language generation tools. They generate plausible-sounding text, not verified knowledge. This leads to several recurring problems that conflict with Wikipedia policies.

=== Unverifiable content, hallucinations, and original research ===

LLMs frequently:

State claims that are not supported by any published source.
Invent references, including plausible-looking but non-existent citations, books, articles, and URLs.
Combine material from different sources into new conclusions.

LLM-generated prose is not acceptable unless the editor has independently verified every statement and every citation against real sources.

=== Bias, BLP, and non-neutral point of view ===

LLMs reflect the biases of their training data and prompts. They may:

Omit important viewpoints or give undue weight to others.
Phrase contentious material in a way that appears neutral but is not.
Repeat, amplify, or invent serious unsourced allegations about living people.

Unsourced or poorly sourced contentious material about living persons must not be added and should be removed immediately, regardless of whether it was generated by an LLM.

=== Copyright and licensing ===

LLMs can reproduce or closely paraphrase copyrighted material from their training data, including books, paywalled journalism, and other non-free content. Their outputs may therefore:

Contain verbatim or near-verbatim copying that is not compatible with Wikipedia's licenses.
Be derivative works whose copyright status is unclear or incompatible with CC BY-SA and the GNU Free Documentation License.

Editors must not add material that infringes copyright, whether written by themselves, copied from a source, or generated by an LLM. Where suspected copyright violations occur, normal copyright enforcement and deletion processes apply.

== Limited acceptable uses ==

LLMs are not forbidden. However, their use should be limited, and editors must already be competent at the task in question without LLM assistance.

Appropriate, low-risk uses might include:

Copyediting and style: Suggesting wording, tightening prose, or correcting grammar in text that the editor understands and has written or verified from sources.
Outlines and brainstorming: Suggesting article structures, lists of subtopics, or questions to investigate, which the editor then researches using independent reliable sources.
Technical assistance: Helping format wikitext, tables, or templates where the editor can verify correctness.

In all such cases:

The editor must check every factual statement against suitable reliable sources.
The editor must ensure compliance with existing Wikipedia policies

If the editor cannot confidently check and fix the output, they should not use an LLM for that task. LLMs should not be used for tasks in which the editor has little or no independent experience.

Repeatedly making problematic LLM-assisted edits may be treated as a competence issue and can result in being blocked.

== Communicating with LLMs and on-wiki ==

Editors should not use LLMs to generate substantive on-wiki comments, arguments, or !votes in discussions.

Wikipedia's consensus-building process depends on editors expressing their own views, based on their understanding of the issue and of Wikipedia's policies and guidelines. Comments that do not reflect an editor's own reasoning are not helpful and may be considered disruptive. Obvious LLM-generated comments may be collapsed or removed in line with existing policies.

Using an LLM to help with phrasing or grammar of a comment is permissible, provided the ideas and reasoning are the editor's own and the editor reviews the text before posting.

== Disclosure and responsibility ==

Editors should disclose significant LLM assistance in the edit summary (e.g. "copyedited with the help of ChatGPT 5.1 Thinking"). This helps other editors understand and review the edit.

Regardless of disclosure:

Editors are wholly responsible for the content they add or change, including LLM-assisted text.
"The AI wrote it" is not a defence for violations of policy

== Handling existing LLM-generated content ==

Where content appears to be substantially or wholly LLM-generated and does not comply with Wikipedia policy, editors may:

Remove the problematic material outright, especially in biographies of living persons.
Replace it with a sourced, policy-compliant stub or summary.
Tag the page as LLM generated under Template:AI-generated.
Draftify, stubify, or nominate for deletion under the usual processes.
Mark the page for speedy deletion under criteria G15.</nowiki>

Happy for feedback, but we really do need to do something quickly to fix the current version of this new Guideline. qcne (talk) 14:27, 24 November 2025 (UTC)[reply]

While I admire your tenacity and what seems to be a well-written proposal, you can't seriously have expected a whole very long guideline to be implemented with no discussion immediately following the closure of a RfC establishing a short one. Katzrockso (talk) 14:33, 24 November 2025 (UTC)[reply]

Yeah, I genuinely thought sense would prevail and the RFC would be closed as unsuccessful. qcne (talk) 14:36, 24 November 2025 (UTC)[reply]

I think the only real consensus in the RFC above was that it is time for the AI-maximalists to WP:DROPTHESTICK and cease stonewalling every AI discussion. I hope we can eventually implement something similar to what you have proposed. -- LWG ^talk 17:14, 24 November 2025 (UTC)[reply]

Support fully. While I agree that a discussion is necessary, having a guideline this important is so short is unacceptable. The proposal looks like an improvement in every way. TheBritinator (talk) 14:44, 24 November 2025 (UTC)[reply]

While I think this is generally a good expansion of the guideline, I would axe the 'limited acceptable uses' section. I worry that any explicit carveout that 'LLMs are fine if you use them right' is just going to encourage people to use them because they think they can use them right. It's better if the guideline sticks to saying what isn't acceptable (which I think should include any amount of identifiable AI-generated text in an article) and then we can let skilled and savvy people work around that on their own if they dare. Athanelar (talk) 14:49, 24 November 2025 (UTC)[reply]

I would partially agree, perhaps reduce it to a single paragraph of:

LLMs are not forbidden. However, their use should be limited, and editors must already be competent at the task in question without LLM assistance. The editor must check every factual statement against suitable reliable sources and ensure compliance with existing Wikipedia policies. If the editor cannot confidently check and fix the output, they should not use an LLM for that task. LLMs should not be used for tasks in which the editor has little or no independent experience. Editors are wholly responsible for the content they add or change, including LLM-assisted text. "The AI wrote it" is not a defence for violations of policy. qcne (talk) 15:04, 24 November 2025 (UTC)[reply]

I say that "comprehensive" in this case means a shorter, not longer guideline. The attack surface must be minimized. I say this because we're not facing a small ragtag team of vandals, we're facing the single largest thing in the world economy and trying to protect the encyclopedia against it. 〜 Festucalex • talk 15:12, 24 November 2025 (UTC)[reply]

But there's a point (and the current Guideline is it) where a short guideline is so short it is useless. qcne (talk) 15:17, 24 November 2025 (UTC)[reply]

I think keeping it simple and straightforward is the best way to avoid introducing new loopholes and new headaches. 〜 Festucalex • talk 15:22, 24 November 2025 (UTC)[reply]

I don't agree that that applies to the current guideline at all. This is already pretty useful in that it allows NPPs and AFC reviewers to decline an article outright with a link to WP:NEWLLM instead of having to waste time checking for source-text discrepancies and otherwise determining if the article is 'unreviewed' enough to meet G15.

Thw problem isn't that this guideline doesn't do what it's designed for, it's that it isn't designed to do enough. Athanelar (talk) 15:24, 24 November 2025 (UTC)[reply]

We should not be declining articles outright because of only suspected AI usage. qcne (talk) 15:25, 24 November 2025 (UTC)[reply]

We would still WP:AGF, obviously, just like we do with everything else. 〜 Festucalex • talk 15:30, 24 November 2025 (UTC)[reply]

AI usage can only ever be 'suspected' outside of some smoking guns like communication intended for the user and utm_source=chatgpt.com fingerprints, but there are enough signs that one can build up a very strong suspicion indeed. As festucalex says, AGF will always apply, but there's more than enough obvious, poor AI usage on Wikipedia for this guideline to be useful as it stands. Athanelar (talk) 15:41, 24 November 2025 (UTC)[reply]

Hard disagree. That's the entire point: AI usage can never be fully proven (unless we're talking about G15 criteria). So the blanket "LLM is banned" sentence wording in this current Guideline is nowhere near enough to deal with endless edge cases, and effectively bans LLM tools at all levels without any nuance. It's crap. qcne (talk) 15:48, 24 November 2025 (UTC)[reply]

Would you be more amenable to a clarification to the effect of Content which consensus determines to be AI-generated is subject to removal, including entire articles if their content is determined to be primarily AI-generated Athanelar (talk) 15:51, 24 November 2025 (UTC)[reply]

> No, because AI content in itself isn’t inherently a problem: it’s unreviewed or unverified AI content that breaches policy. The issue isn’t whether something was generated by an LLM, but whether it meets our existing standards.

I just generated the above reply with ChatGPT 5.1 Auto. It is exactly how I feel, I just ran it through the text transformer to make a point. This Guideline would effectively censor that reply? qcne (talk) 15:58, 24 November 2025 (UTC)[reply]

Whether AI content is inherently a problem is exactly what's being discussed. I don't agree that contributing AI-generated content to Wikipedia is fine as long as it doesn't breach policy, but I completely understand if you simply fundamentally disagree with that assertion.

My position is that even if I could hypothetically open up ChatGPT 7.0 and type 'Generate a Wikipedia article about foo which complies with all relevant policies and guidelines' and it would give me a completely acceptable output ready to be published to mainspace, I don't think I should be allowed to do that, because the end result would be that rather than Wikipedia being a passion project of human volunteers, it would be a glorified Grokipedia. The people responsible for Wikipedia's content would not be human volunteers who are here to build an encyclopedia but rather programmers at OpenAI.

There are biases and flaws inherent to large language models that are uncontrollable and antithetical to the philosophy of Wikipedia as a project. The rest of the internet is already being invaded and replaced en masse by machine-generated content, and it is absolutely vital that we prevent Wikipedia from going the same way. If we allow AI-generated content provided that it meets relevant standards, all we're doing is setting a timer for when the LLMs will be advanced enough to create Wikipedia-suitable output, at which point the entire website will be slowly overtaken by machine-generated output just like Google search results.

The argument that myself and others are making is that now is potentially our last chance to draw a line in the sand and declare that our active position is one of purging machine-generated output from Wikipedia in order to maintain the project's fundamental philosophy; free and open access to information. Athanelar (talk) 16:08, 24 November 2025 (UTC)[reply]

I completely agree that unreviewed LLM content should not be allowed. The current Guideline effectively bans LLM generated prose, both reviewed and unreviewed. I have no problem with LLM content that has been reviewed and is policy-compliant.

This quickly becomes a Ship of Theseus situation with the current guideline. If I generate some prose in ChatGPT and review 50% of it and edit that 50% to comply with policies. Is that banned? What about 75% of the content? What if I take out a thesaurus and change every word, but keep the layout as ChatGPT generated it? qcne (talk) 16:11, 24 November 2025 (UTC)[reply]

The problem is, how do we prove 'review?' If an LLM hypothetically becomes capable of generating Wikipedia-suitable content on its own, and a person inserts that content while chopping out all the utm_source fingerprints and user-directed communication etc; assuming there are no source-text discrepancies and the output is otherwise entirely acceptable for Wikipedia, has that been 'reviewed'? Should we accept it? Should we accept an increasing percentage of the encyclopedia consisting of that sort of content in the future?

My contention is that reviewed or unreviewed, the presence of AI-generated content on Wikipedia is philosophically against what we ought to stand for. I know I'm speaking in hypotheticals at the minute, but it's something we need to consider - if LLM technology is going to get better in the future, it's better we put our foot down on it now rather than saying 'it's fine for now as long as you review it' and then in a couple of years we have to frantically try to address it when human review no longer becomes necessary for AI output to pass Wikipedia's standards.

As for your ship of Theseus problem - my response to that is exactly what I've already proposed; that we should forbid and remove all content that consensus determines to be AI generated. Athanelar (talk) 17:39, 24 November 2025 (UTC)[reply]

+1 I'd axe the "limited acceptable uses" section also.

Additionally, While the "disclosure and responsibility" section sounds good, it will indicate to editors that LLM use is sanctioned in some form, and will encourage use in some cases. Better not to mention it at all. fifteen thousand two hundred twenty four (talk) 15:42, 24 November 2025 (UTC)[reply]

I understand your point of view, but LLM usage is sanctioned. We have plenty of experienced editors who regularly contribute to new articles using LLMs, and check the output carefully and ensure it's in-line with policies. AI is another research tool at this point. qcne (talk) 15:49, 24 November 2025 (UTC)[reply]

Currently LLM use is only sanctioned insofar as it's not forbidden by policy, but a disclosure requirement would introduce into policy an explicit allowance, thus encouraging use. I don't think any allowances should be incorporated into policy in any form. fifteen thousand two hundred twenty four (talk) 16:01, 24 November 2025 (UTC)[reply]

I hate AI slop, and my AfC and CSD log is littered with AI slop reports. But it's madness to ban a tool which is now used by 100~ million people a day. Which is double those who visit Wikipedia a day. qcne (talk) 16:04, 24 November 2025 (UTC)[reply]

You will notice with careful reading that neither of my comments involve entirely banning LLMs. They both concern not putting explicit allowances into policy which would encourage further use, which is the status quo. fifteen thousand two hundred twenty four (talk) 16:10, 24 November 2025 (UTC)[reply]

This is generally a sensible rewrite, and I would support it in an RfC. This is, however, not an RfC. I'm interpreting this as an RFCBEFORE/workshopping discussion before an RfC is opened. @Qcne, is that correct? Toadspike [Talk] 15:41, 24 November 2025 (UTC)[reply]

@Toadspike Initially I was going to open an RFC on WP:LLM to promote that to Guideline instead, but was told this has been tried repeatedly and failed. I wrote my version up as a condensed version of WP:LLM.

Guidelines are not set in stone, and I don't think we need yet another protracted RFC. My hope was to be bold and change the Guideline to something better and see if anyone reverts it. Someone did.

Maybe we can incrementally change and improve the existing Guideline using bits from my version, and feedback from other users.

Whatever happens though, this current version of the Guideline is awful. qcne (talk) 15:46, 24 November 2025 (UTC)[reply]

I heartily support this as the ideal guideline toward which we should be working. -- LWG ^talk 17:23, 24 November 2025 (UTC)[reply]

I also support this proposal, because as I and many others have pointed out, there is no reason to ban carefully reviewed, policy compliant content. Kovcszaln6 (talk) 17:34, 24 November 2025 (UTC)[reply]

Comment - this echoes WP:BOTMULTIOP on Disclosure & Competence which I think the current iteration of the guideline is missing. A shorter summary version linking to the full guideline could also be useful under Wikipedia:Bot policy#Other bot-related matters. Sariel Xilo (talk) 18:59, 24 November 2025 (UTC)[reply]

Version 2

See version 3 posted below

Based off feedback, I have revised my proposed full Guideline as below. This strengthens the scope of the LLM use, lessens the acceptable use section, and adds in some definitions. Full version at User:Qcne/LLMGuideline.

Version 2 of a suggested comprehensive Guideline.

== Purpose and scope ==

This guideline explains how editors may and may not use large language models (LLMs, also known as AI chatbots) when editing Wikipedia. It applies to all models and all LLM-generated output.

In this guideline, a large language model (LLM) means any programme that can generate natural-language text (sentences and paragraphs) in response to prompts. This includes tools marketed as "AI chatbots" or "AI writing assistants", such as ChatGPT, Google Gemini, Microsoft Copilot, and similar services, whether used in a browser, an app, or built into other software. It does not cover spellcheckers, grammar checkers, or basic autocomplete, although editors remain responsible for any text they accept from such tools.

Editors remain fully responsible for all edits they make, including LLM-assisted edits. All edits must comply with existing Wikipedia policies.

== Do not use an LLM to add unreviewed content ==

Editors must not use an LLM to add unreviewed text to Wikipedia, whether creating a new article or editing an existing one. Do not use an LLM as the primary author of a new article or a major expansion of an existing article, even if you plan to edit the output later.

For this guideline, unreviewed means output that the editor has not checked line by line for accuracy, sourcing, neutrality, and copyright problems against suitable reliable sources and against existing Wikipedia policies, at least as rigorously as if the editor had written the text themselves. You must verify every substantive claim against the cited sources. Reading the output to ensure it "sounds correct" is not sufficient.

In particular, editors should not:

Paste raw or lightly edited LLM output as a new article or as a draft intended to become an article.
Paste raw or lightly edited LLM output into existing articles as new or expanded prose.
Paste raw or lightly edited LLM output as new discussions or replies to existing discussions.

Where content is largely or entirely based on unedited or lightly edited LLM output, it may be draftified, stubified, nominated for deletion, collapsed, or removed entirely, especially where the content is unverifiable, fabricated, or otherwise non-compliant with existing Wikipedia policies.

== Why LLM-written content is problematic ==

LLMs are language generation tools. They generate plausible-sounding text. This leads to recurring problems that conflict with Wikipedia policies.

=== Unverifiable content, hallucinations, and original research ===

LLMs frequently:

State claims that are not supported by any published source.
Invent references, including plausible-looking but non-existent citations, books, articles, and URLs.
Combine or synthesise material from different sources into new or false conclusions.

=== Bias, BLP, and non-neutral point of view ===

LLMs reflect the biases of their training data and prompts. They may:

Omit important viewpoints or give undue weight to others.
Phrase contentious material in a way that appears neutral but is not.
Repeat, amplify, or invent serious unsourced allegations about living people.

=== Copyright and licensing ===

LLMs can reproduce or closely paraphrase copyrighted material from their training data, including books, paywalled journalism, and other non-free content not compatible with Wikipedia's licences. Their outputs may:

Contain verbatim or near-verbatim content that infringes copyright.
Be derivative works whose copyright status is unclear or incompatible with Wikipedia.

== Limited use ==

Editors are strongly discouraged from using LLMs. LLMs, if used at all, should assist with narrow, well-understood tasks such as copyediting. New editors should not use LLMs when editing Wikipedia.

If an experienced editor nonetheless chooses to use an LLM, they must:

Not use it to generate the bulk of a new article or major expansion.
Check the output they intend to use against suitable reliable sources.
Ensure the output complies with existing Wikipedia policies.
Not treat the output as authoritative or as a substitute for their own judgement.

If the editor cannot confidently check and correct the output, they should not use an LLM for that task. LLMs should not be used for tasks in which the editor has little or no independent experience. Editors should also be cautious about using LLMs to write comments in discussions on Wikipedia.

Repeatedly making problematic LLM-assisted edits may be treated as a competence issue and can result in the editor being blocked. Such blocks are intended to stop further disruptive use of LLMs on Wikipedia.

=== Disclosure and responsibility ===

Editors should disclose LLM assistance in the edit summary (e.g. "copyedited with the help of ChatGPT 5.1 Thinking"). This helps other editors understand and review the edit.

Regardless of disclosure:

Editors are wholly responsible for the content they add or change, including LLM-assisted text.
Disclosure does not make non-compliant content acceptable.
"The AI wrote it" is not a defence for violations of Wikipedia policy or guideline.

== Handling existing LLM-generated content ==

The mere fact (or suspicion) that content was generated by an LLM is not, by itself, a reason to delete or remove it. Editors should base their actions on specific problems with the content under existing policies.

Where content appears to be substantially or wholly LLM-generated without review and does not comply with Wikipedia policy, editors may:

Remove the problematic material outright, especially in biographies of living persons.
Replace it with a sourced, policy-compliant content.
Tag the page as LLM generated under Template:AI-generated.
Draftify, stubify, or nominate for deletion under the usual processes.
Mark the page for speedy deletion under criteria G15.

Again, happy to take feedback and I have also calmed down a little from this afternoon's shock at the original RFC being closed. qcne (talk) 20:57, 24 November 2025 (UTC)[reply]

Thanks for doing this. The new version is a big improvement, and I would gladly support its promotion to Guideline. -- LWG ^talk 21:28, 24 November 2025 (UTC)[reply]

Support this as well. I especially like the disclosure part here. I think proposal #1 is a bit more straightforward, but this works too, and is way more detailed. It's definitely a major improvement over the current guideline, where the FAQ specifically stated that the proposal was vague on purpose to gain consensus. Z E T A^C 21:36, 24 November 2025 (UTC)[reply]

This is great, but why not just say LLMs should not be used to generate content. This excepts copyediting etc. Maybe put that in the nutshell? Kowal2701 (talk) 21:36, 24 November 2025 (UTC)[reply]

It is not that simple: is an AI translation from a foreign language OK? Викидим (talk) 21:53, 24 November 2025 (UTC)[reply]

No. LLM translation is one of the uses we specifically discourage, because without the output being checked by someone fluent in both languages there's no way to verify that it's an accurate and reliable translation of the source text; and if you have someone fluent in both languages available, they might as well just translate it themselves to begin with, since it's going to be just as much effort to double-check the resulting output. Athanelar (talk) 21:56, 24 November 2025 (UTC)[reply]

Using machine translation to generate a draft from a source that will then be carefully reviewed and edited by translator is a pretty common practice in translation and should not be prohibited. It's generally not true that a fluent speaker can translate from scratch just as easily as they can revise a machine translation (if that were true machine-assistance wouldn't be widely used by basically all professional translators). The kinds of errors machines make in translation specifically are also much better understood than the kinds of errors LLMs make in generating content from prompts, so it's easier for a competent translator to catch and fix translation errors than for a competent writer to catch and fix the kinds of issues that LLM-generated prose tends to have. The critical key here is machine-translated draft text must be gone over line by line by a fluent speaker of both languages who also has transfer competence, that is, awareness of the types of errors and misunderstandings that commonly come up in translation and how to mitigate them. -- LWG ^talk 22:55, 24 November 2025 (UTC)[reply]

Most professional translators use machine translation (AI-based or otherwise). It's a significant time saver. WhatamIdoing (talk) 06:15, 27 November 2025 (UTC)[reply]

I have professional translation experience. I have never used machine translation. In fact, I have had the misfortunate to spend an inordinate amount of time cleaning up after garbled machine translations produced by wayward 'professional' translators who couldn't discern even the most evident nuance in either source or target language. Of course, the quality of machine translation varies greatly by language pair, but as I said during the RfC, machine translation wastes more time than it saves. A good translator will spend less time producing his own translation than trying to make sense of machine-mangled dross. Yours, &c. RGloucester — ☎ 08:55, 27 November 2025 (UTC)[reply]

I'm not a professional translator, but I've had the misfortune of cleaning up after human translators who miss (e.g.,) the distinction between "free as in beer" and "free as in freedom" even after they were told to watch out for that mistake. But not every language, not every translator, not every tool is perfect. Most translators use machine translation, though one survey said that only about half run the whole document through MT and then edit the results. However, there are other ways to use MT, some of which are not very different from using a dictionary. WhatamIdoing (talk) 02:18, 28 November 2025 (UTC)[reply]

That's already covered by WP:MACHINETRANSLATION. Sariel Xilo (talk) 22:02, 24 November 2025 (UTC)[reply]

The MACHINETRANSLATION talks about unedited machine translation. If the guideline would prohibit unedited AI output, I would have no problem here. Викидим (talk) 22:22, 24 November 2025 (UTC)[reply]

This is way better as a starting point. To stay on the sane side, editors need carve-outs that provide safe harbors ("within reasonable limits, this particular behavior is OK, and, for alleged misuse, the burden of proof is on the accuser"). Note that if the use of the tolls were explicitly permitted for some tasks, I would argue the opposite: the editor using them should shoulder the burden of proof that their use was beneficial against "this is slop" accusations.Викидим (talk) 21:47, 24 November 2025 (UTC)[reply]

I think this is 90% of the way there, but I still have some issues with it. Namely, this is essentially a proposal for a completely different, less restrictive guideline; it's more like promoting the WP:LLM essay to guideline rather than iterating on NEWLLM.

Editors should also be cautious about using LLMs to write comments in discussions on Wikipedia. this appears to undermine the well-regarded essay subsection WP:LLMCOMM which states the more firm Editors should not use LLMs to write comments generatively. I think having a chatbot talk on your behalf is a pretty clear competence/communication is required issue which we should forbid, not caution.

Do not use an LLM to add unreviewed content This is essentially the bulk of your proposed guideline, but it seems to walk back what's been agreed upon in NEWLLM, which is that articles shouldn't be created wholesale with LLMs, regardless of whether they're reviewed or not. Now, whether that should apply to all content is a separate matter, of course, but your proposed guideline would essentially be a reversal of NEWLLM and would again permit wholly AI-generated articles, provided they were reviewed. If you're proposing this as an iteration of NEWLLM, it should reflect the agreed-upon prohibition of AI generating entire new articles, while stressing that unreviewed AI edits in general are not allowed. It seems like your new guideline wouldn't actually change anything from the consensus status-quo prior to the promotion of NEWLLM.

I do think in general you're swimming against the cultural current here. My takeaway from the NEWLLM RFC is that people (myself included, so I'm certainly biased) want more restrictions on AI usage, we don't want to codify carveouts and exceptions.

The mere fact (or suspicion) that content was generated by an LLM is not, by itself, a reason to delete or remove it. Ditto as above. This would be a direct reversal of NEWLLM, which is explicitly saying the opposite; that new articles generated 'from scratch' using an LLM are not allowed.

I would support this on the grounds that it's changed to at least explicitly prohibit the generation of new articles entirely or primarily with LLMs, because the RFC conclusion for NEWLLM has clearly determined that that restriction is approved by community consensus. In future I would propose a modification of it to forbid all AI-generated text, but that's my own agenda here anyway, of course. At a minimum, though, if this guideline is to serve as an expansion of what's been codified here at NEWLLM, it needs to explicitly forbid the creation of new articles with AI output. Athanelar (talk) 21:48, 24 November 2025 (UTC)[reply]

Any blanket prohibition on LLM article creation or LLM use will never be absolute because all rules on Wikipedia have exceptions. SuperPianoMan9167 (talk) 22:29, 24 November 2025 (UTC)[reply]

Relying on IAR daily is wrought with peril, IMHO. It is much better to codify the safe AI use, if any. Викидим (talk) 22:32, 24 November 2025 (UTC)[reply]

We already rely on IAR daily, for example WP:SNOW. It’s the unconventional uses that are perilous. There’s a trade off between making a guideline/policy that perfectly addresses the problem, and one that all users will easily understand, simplicity is valuable here Kowal2701 (talk) 22:41, 24 November 2025 (UTC)[reply]

Of course, but that doesn't mean the rule shouldn't exist in the first place. Vandalism is the perfect example. Everybody knows vandalism isn't allowed on Wikipedia. There's no objective standard for what 'vandalism' is, because it's something obviously impossible to define in a way that includes everything that a reasonable person would agree is vandalism and excludes everything that a reasonable person would agree isn't vandalism and respects the principle of AGF etc etc etc. Despite all those things, we can still say 'Vandalism isn't allowed on Wikipedia, vandalism will be removed, and vandals will be prevented from vandalising the wiki' because we're all sensible people capable of judging what is and isn't detrimental to Wikipedia and acting accordingly.

Saying 'creating articles wholesale from primarily AI-generated text is forbidden' is obviously going to have edge cases and exceptions, but the core of it is that if you create a new article, and reasonable consensus determines that the text of that article is primarily AI generated, the article is subject to removal.

"IAR exists" doesn't mean "we should never prohibit anything" Athanelar (talk) 22:51, 24 November 2025 (UTC)[reply]

Newbies using LLMs are not intentionally trying to disrupt Wikipedia, and so you cannot compare them to vandals. Most of them have no idea how hostile the community is to LLMs and likely don't know why their outputs are inherently unreliable, likely because the hype around LLMs obscures their shortcomings in favor of promoting investment and public interest. SuperPianoMan9167 (talk) 22:59, 24 November 2025 (UTC)[reply]

I didn't compare newbies using LLMs to vandals. I used vandalism as an illustrative example of why a prohibition being hard to strictly define and subject to exceptions doesn't mean the prohibition shouldn't exist. Athanelar (talk) 23:01, 24 November 2025 (UTC)[reply]

That makes sense. Sorry. I've seen the comparison of LLM users to vandals elsewhere on this talk page and I wanted to address it. SuperPianoMan9167 (talk) 23:04, 24 November 2025 (UTC)[reply]

This is thoughtfully written and I like the framework, but I cannot support a guideline that allows "reviewed LLM output" when the evidence is overwhelming that such review is almost always insufficient. I see zero argument that newer editors in particular should be allowed to use LLMs to generate article content under any circumstance. If we want to allow a carve out for reviewed LLM generated content, we should create an LLM-user user right with the same requirements as autopatrolled. NicheSports (talk) 21:53, 24 November 2025 (UTC)[reply]

I think that an llm-user right makes perfect sense, and had proposed it all the way through the RfC. Викидим (talk) 21:54, 24 November 2025 (UTC)[reply]

+1 to both Niche and Athanelar Kowal2701 (talk) 22:23, 24 November 2025 (UTC)[reply]

This is far too bloated with needless words and yet achieves very little– the opposite of what I wanted this guideline to achieve. This would be better given these changes:

== Purpose and scope ==

~~This guideline explains how editors may and may not use large language models (LLMs, also known as AI chatbots) when editing Wikipedia. It applies to all models and all LLM-generated output.~~ Get to the point; everyone knows what a guideline is.

In this guideline, a large language model (LLM) means any programme that can generate natural-language text (sentences and paragraphs) in response to prompts. This includes tools marketed as "AI chatbots" or "AI writing assistants", such as ChatGPT, Google Gemini, Microsoft Copilot, and similar services, whether used in a browser, an app, or built into other software. It does not cover spellcheckers, grammar checkers, or basic autocomplete, although editors remain responsible for any text they accept from such tools.

~~Editors remain fully responsible for all edits they make, including LLM-assisted edits. All edits must comply with existing Wikipedia policies.~~ This always applies; there's no need to restate our basic editing principles in every guideline

== Do not use an LLM to add ~~unreviewed~~ content Anyone imbecilic enough to want to use an LLM to add substantial new content to an article can't be trusted to review it properly. We have seen time and time again that "review" is unsufficient to counteract the inherent problems ==

Editors must not use an LLM to add ~~unreviewed~~ text to Wikipedia, whether creating a new article or editing an existing one. Do not use an LLM as the primary author of a new article or a major expansion of an existing article, even if you plan to edit the output later.

For this guideline, unreviewed means output that the editor has not checked line by line for accuracy, sourcing, neutrality, and copyright problems against suitable reliable sources and against existing Wikipedia policies, at least as rigorously as if the editor had written the text themselves. You must verify every substantive claim against the cited sources. Reading the output to ensure it "sounds correct" is not sufficient.

In particular, editors should not:

Paste raw or lightly edited LLM output as a new article or as a draft intended to become an article.
Paste raw or lightly edited LLM output into existing articles as new or expanded prose.
Paste raw or lightly edited LLM output as new discussions or replies to existing discussions.

Where content is largely or entirely based on unedited or lightly edited LLM output, it may be draftified, stubified, nominated for deletion, collapsed, or removed entirely, especially where the content is unverifiable, fabricated, or otherwise non-compliant with existing Wikipedia policies.

~~== Why LLM-written content is problematic ==~~

~~LLMs are language generation tools. They generate plausible-sounding text. This leads to recurring problems that conflict with Wikipedia policies.~~ Guidelines are not information pages

{{{1}}}

Repeatedly making problematic LLM-assisted edits may be treated as a competence issue and can result in the editor being blocked. Such blocks are intended to stop further disruptive use of LLMs on Wikipedia. This is all irrelevent – not to mention redundant to WP:LLM. Bloat like this is what turns useful guidelines into mazes of dry prose wherein it cannot be distinguished what is actual guidance and what is useless "background" information.

{{{1}}}

~~"The AI wrote it" is not a defence for violations of Wikipedia policy or guideline.~~ No, they shouldn't use it, period.

== Handling existing LLM-generated content ==

The mere fact (or suspicion) that content was generated by an LLM is not, by itself, a reason to delete or remove it. Editors should base their actions on specific problems with the content under existing policies.

Where content appears to be substantially or wholly LLM-generated without review and does not comply with Wikipedia policy, editors may:

Remove the problematic material outright, especially in biographies of living persons.
Replace it with a sourced, policy-compliant content.
Tag the page as LLM generated under Template:AI-generated.
Draftify, stubify, or nominate for deletion under the usual processes.
Mark the page for speedy deletion under criteria G15.

Don't shoehorn in useless side information that obscures the point of the guideline and bloats it into an information page, while also watering down the actual guideline to be next to useless. I think you've completely missed the point of what this guideline is trying to accomplish. Cremastra (talk · contribs) 02:03, 25 November 2025 (UTC)[reply]

@Cremastra, as a side note, could I introduce you to Template:Text diff? Use separate instances if you want to interpolate explanations. WhatamIdoing (talk) 06:18, 27 November 2025 (UTC)[reply]

Some initial notes:

Many "grammar checkers" use AI now (Grammarly being the main example), and can go well beyond what most people would consider basic proofreading.
"Hallucinations" section should mention something about AI text claiming that real sources contain information or analysis that they actually don't. This is the big issue with current chatbots and I think it gets overshadowed by "AI can make up fake sources."
Because of that, there should also be something saying "don't use AI to cite or summarize a source you didn't personally read."
Non-neutral point of view should add something about promotional/didactic/editorializing tone as this is a major issue, and also point out that AI can do this even when it claims it's making the text neutral.
Mention that AI can surface or cite unreliable sources, even when it claims they're reliable. (the "cites blog or free web host" tag gets a workout)

Gnomingstuff (talk) 02:40, 25 November 2025 (UTC)[reply]

Yes, my experience with AIs is that they constantly hallucinate information, quotes, etc in sources that appears nowhere in the text, often to the opposite implication of what the source states. Katzrockso (talk) 02:53, 25 November 2025 (UTC)[reply]

There is a very easy way to prevent AI from hallucinating sources. This is a two-step process: (1) human editor identifies the sources and uploads their text into AI along with bibliographic information and (2) in the prompt, editor explicitly states to "use only provided sources". That's it, the problem of sources is gone - AI will not hallucinate sources unless it is allowed to. Coaxing AI into creating OR by not providing sources or not explicitly prohibiting it to find guarantees the issues. To me, this is simply an aspect of the use of a complex tool, that can be solved by education (and blocking the ones that are unable to learn). Викидим (talk) 05:45, 25 November 2025 (UTC)[reply]

This doesn't always work and I have an example, will share tomorrow NicheSports (talk) 05:47, 25 November 2025 (UTC)[reply]

I use Google Gemini 2.5 and now 3.0 for work extensively, and never had an issue with it not following "use only provided sources" instruction. I would be very interested in seeing the counter-example, naturally. Викидим (talk) 06:09, 25 November 2025 (UTC)[reply]

AI will not hallucinate sources unless it is allowed to. – Models are not so perfectly constrained, there is no fundamental blocker to a model hallucinating a new reference, or permuting a provided one. Hallucinations are fundamental and cannot be entirely obviated just by providing a "better prompt". fifteen thousand two hundred twenty four (talk) 05:53, 25 November 2025 (UTC)[reply]

As I have said, my opinion is based on personal experience. One of the main uses of AI is to summarize long texts for human consumption, so all model makers are very careful not to ruin this experience, so instruction "just use what you were shown" is well-tested, and lapses of AI judgement (in this respect) should have low probability. Викидим (talk) 06:13, 25 November 2025 (UTC)[reply]

Personal experience does not override the fundamental fact that these are, at their core, predictive models, and they will predict wrong. This is unsolvable with current approaches.

Also my experience is that they are poor at summarizing content, a recent Wikipedia-specific example would be the simple summaries debacle [3][4] which you may have missed. fifteen thousand two hundred twenty four (talk) 08:26, 25 November 2025 (UTC)[reply]

We are both entitled to our own opinions. Currently, mine is that the capability of modern AI models on many text-processing tasks rivals that of a skilled human professional. I am referring to relatively standard tasks for now, but the shift is palpable. As an example, less than an hour ago, I completed a project that previously would have required the services of a specialist. These types of projects used to take a week; I completed this one single-handedly in three stress-free days. What was most disturbing, it actually felt like collaborating with a technical expert. I could guide it: 'Shouldn't the value of X be zero here?' It would agree, provide the mathematical proof, and correct the handling of the condition. For reference, just a year ago, a similar attempt resulted in pages of incoherent text that the AI stubbornly insisted were correct. Consequently, I believe the current discourse on using AI in Wikipedia is largely driven by editors observing the output of free models of yesteryear used without proper prompting. To be clear, the vision of AI surpassing humans unsettles me, too, but this is the inescapable future. Humans must learn to use these new tools; those who don't will simply lose out to those who do. (AI disclosure: I have asked Gemini 3.0 to improve this text, the result certainly sounds more robotic, but feels way more readable now. For the avoidance of doubt, I do not use, and never used, AI to write of polish my comments in English, this was a one-off demonstration). Викидим (talk) 11:29, 25 November 2025 (UTC)[reply]

Your comments were perfectly readable before... Gnomingstuff (talk) 20:02, 25 November 2025 (UTC)[reply]

And if they hadn't been, he wouldn't be qualified to assess whether the LLM output was an improvement. -- LWG ^talk 20:13, 25 November 2025 (UTC)[reply]

That's not really true. If you know your work is weak (e.g., dyslexia, English as a second or third language), then you can often recognize improvements when they're shown to you, even if you couldn't come up with the better option on your own. WhatamIdoing (talk) 06:23, 27 November 2025 (UTC)[reply]

Not sure I agree. I can bang out some freshman 101-level Spanish, but if someone were to rewrite that into other Spanish text, I would have no idea whether it was an improvement, or for that matter whether it even meant the same thing. Or if I chose a random article on the Spanish Wikipedia - hitting random gave me the the article for "calculator" - I don't know how good or accurate it is. I can guess that certain things sound good or questionable -- e.g., LLM(Grandes Modelos del Lenguaje) que crean y adaptan calculadoras, abriendo un nuevo paradigma -- but I don't actually know because I'm not a fluent Spanish speaker. Gnomingstuff (talk) 06:00, 30 November 2025 (UTC)[reply]

That Spanish means "Large language models that create and adapt calculators, opening a new paradigm". SuperPianoMan9167 (talk) 06:05, 30 November 2025 (UTC)[reply]

I know basically what it means. But the nuances of the language -- is this sentence structure idiomatic? is it normal to use an English acronym like this in Spanish? does paradigma have the same promotional connotations in Spanish as it does in English? does the present-participle thing apply to Spanish AI text? -- are beyond my level of understanding of Spanish. And that's just one sentence with closer cognates than usual, in a language that does share cognates with English. Gnomingstuff (talk) 20:46, 30 November 2025 (UTC)[reply]

Got it. SuperPianoMan9167 (talk) 20:47, 30 November 2025 (UTC)[reply]

While agreeing with you on details, I would still posit that identifying the traits is easier than understanding the same traits and much easier than reproducing the traits. After all, through evolution humans got really good at guessing the whole picture while seeing just parts of it (the ones who were bad at it became food for tigers hidden in the overgrowth of the jungle). I thus fully expect myself to be capable to appreciate the beauty of texts written in a foreign language without any ability to write equally good texts in the same way I can appreciate the classical music without any skills to create it. Викидим (talk) 17:56, 1 December 2025 (UTC)[reply]

late to the party, support thinking of this as a sensible policy proposal.

Think main question is:
A) apparently it is now possible to continuously iterate and prompt-engineer over time an article. i.e. you can tell the ai that fact A was wrong, and the ai can regenerate without the fact. Does this count as line-by-line verification? i think no.
B) maybe include a link at the end to WP:AINB? and also info on when to escalate if a user refuses to stop doing LLM incorrectly? i.e. An editor caught using LLMs to generate and insert text repeatedly, and with such text not complying with other Wikipedia policy, can be reported to ANI. Do not report an editor for suspicion of LLM usage, but who has otherwise followed other Wikipedia policy. User:Bluethricecreamman (Talk·Contribs) 06:06, 1 December 2025 (UTC)[reply]

Oppose. For one I do not agree the current guideline os bad it is a good starting point.It can be improved but in my view that improvement should be expanding it to cover generating content for an article so this proposal does not go far enough in my view.GothicGolem29 (Talk) 18:55, 28 November 2025 (UTC)[reply]
This is a sensible policy proposal. My only quibble is that under "limited use" it should read "Editors are strongly discouraged from using LLMs to add content." --Enos733 (talk) 18:17, 2 December 2025 (UTC)[reply]
Weak support: I guess this is the one that's emerging as the preferred revision. I have some nitpicks still about the text and frankly about the content (my ideal policy is basically NicheSports' below), but we don't have time to keep bikeshedding this shit. Going three years with AI policy has done substantial damage to the integrity of Wikipedia, and frankly we're probably near the precipice where that damage becomes irreversible. Gnomingstuff (talk) 02:49, 3 December 2025 (UTC)[reply]
Strong oppose: As I stated above for v1, I oppose anything that codifies "acceptible" LLM use, indicating to editors that there are forms of model use that are specificially sanctioned will further encourage LLM-editing, to the detriment of the project. Currently usage is only allowed insofar as it is not explicitly banned, and this is how it should remain at most.

For me, this means no explaining how editors may and may not use models, only how they may not, no Editors remain fully responsible, including LLM-assisted edits (superfluous anyways), no Limited Use, etc.

I also oppose any form of disclosure requirement, requiring disclosure would again, be explicitly sanctioning LLM use. See prior relevant discussions at WP:Village pump (policy)/Archive 205#Checkbox for disclosure and WP:Village pump (idea lab)/Archive 47#Adding LLM edit tag where similar concerns were expressed.

I'm wary of bikeshedding, same as Gnomingstuff, but adding a guideline that would explicitly endorse LLM editing is far from a trivial concern and requires careful consideration. fifteen thousand two hundred twenty four (talk) 04:22, 3 December 2025 (UTC)[reply]

+1 The effort spent 'bikeshedding' now will be nothing in comparison to trying to enact more LLM restrictions down the road if we set something too permissive in stone now. Athanelar (talk) 05:42, 3 December 2025 (UTC)[reply]

And in the meantime we have no policy. Gnomingstuff (talk) 06:07, 3 December 2025 (UTC)[reply]

Extending the "meantime" is much preferred if it means we don't end up with a policy encouraging LLM use. I'm not seeking perfection, the Do not use an LLM to add unreviewed content section of qcne's proposal would be sufficient on its own. Much of the rest reminds me of an omnibus bill, packaging a policy that the community has previously indicated support for (via NEWLLM and G15) with unnecessary extras that the community has not (some LLM use is OK actually). fifteen thousand two hundred twenty four (talk) 06:29, 3 December 2025 (UTC)[reply]

Having no policy effectively is encouraging LLM use. The longer we remain in that state, the longer we are encouraging LLM use, and the longer it accumulates on Wikipedia. Gnomingstuff (talk) 13:02, 3 December 2025 (UTC)[reply]

Well, we have NEWLLM as it stands, which is already I would argue more restrictive (and therefore preferable to my own ideology on this) than qcne's proposal. Athanelar (talk) 13:51, 3 December 2025 (UTC)[reply]

Version 3

I still believe this Guideline is grossly short and needs to be expanded a little bit, but am also taking into account the feedback given.

Would my much shorter Version 3 guideline here be at all acceptable to the more hard-line anti-LLM editors? I have:

made it more concise.
removed the limited use carve-out, with the idea that experienced editors can be trusted to use LLMs, and this Guideline is more focused towards new editors.

Hidden ping to users who have participated. qcne (talk) 22:37, 3 December 2025 (UTC)[reply]

I predict that hard-line anti-LLM editors will still want the word "unreviewed" removed from "do not add unreviewed LLM-generated content to new or existing articles". SuperPianoMan9167 (talk) 22:40, 3 December 2025 (UTC)[reply]

Yes, potentially, but I would like to have some sort of compromise! qcne (talk) 22:41, 3 December 2025 (UTC)[reply]

Agreed. SuperPianoMan9167 (talk) 22:42, 3 December 2025 (UTC)[reply]

The compromise on the reviewed language is to only allow it for experienced editors with an llm-user right. A few editors have suggested this. There is a vast amount of evidence (AfC, NPP, 1346 (hist · log), any WikiEd class, etc.), that inexperienced editors essentially never sufficiently review LLM-generated prose or citations. NicheSports (talk) 23:31, 3 December 2025 (UTC)[reply]

I think that'd have to be a separate RfC, would support Kowal2701 (talk) 23:32, 3 December 2025 (UTC)[reply]

Given my experience with CCIs of autopatrolled and NPR editors, and even the odd admin, would you be offended if I scream "NO!" really loudly at the idea of tying LLM use to a user right?

Sorry, but I've had too much trouble with older users being grandfathered in to the autopatrolled system to be comfortabel with the idea of giving somebody the right to say "Oh, but my use of Chat GPT is fine - I have autopatrolled!" GreenLipstickLesbian 💌🧸 23:48, 3 December 2025 (UTC)[reply]

Valid point. There's been at least one editor who had their autopatrolled right revoked for creating unreviewed LLM-generated articles. SuperPianoMan9167 (talk) 23:53, 3 December 2025 (UTC)[reply]

far from being offended, I actually laughed 😅 but I would still much much rather deal with that problem than continuing the fantasy that inexperienced editors should be allowed to use these tools with review that is never performed! NicheSports (talk) 23:56, 3 December 2025 (UTC)[reply]

Disagree with adding an LLM-user right, but either way I think that is best workshopped elsewhere. fifteen thousand two hundred twenty four (talk) 23:55, 3 December 2025 (UTC)[reply]

The issue with "unreviewed" is that it is at risk of being wikilawyered, even a bad review would be kosher. Otherwise it's great. I worry that by having a nuanced approach, it'd struggle to communicate a clear message, especially since people dispositioned to use LLMs likely already have CIR issues that LLM-use is compensating for. I'd remove "unreviewed", and especially where the content is unverifiable, fabricated, or otherwise non-compliant with existing Wikipedia policies can support people's IAR "not what the policy was intended for" (if they so want) in the fringe cases LLM-use is not practically problematic, subject to consensus. Kowal2701 (talk) 23:08, 3 December 2025 (UTC)[reply]

"insufficiently reviewed" has more wiggle room while still allowing for the edge cases; once any problem is identified, it puts the responsibility on the person adding the content rather than other editors. GreenLipstickLesbian 💌🧸 23:43, 3 December 2025 (UTC)[reply]

That'd be good too Kowal2701 (talk) 01:30, 4 December 2025 (UTC)[reply]

Honestly, "unreviewed" has been my main point of disagreement in every proposal that includes it -- thank you for articulating it. There are two fundamental problems:

First, if it's hard to know whether someone used AI, it's even harder to know how much they reviewed it.

Second, and more problematic: Properly "reviewing" LLM content means that every single word, fact, and claim needs to be verified against every single source. You essentially need to reconstruct the writing process in reverse, after the fact. But most good-faith editors who use AI seem to think "reviewing" means one of two things:

Quickly skimming it and going "yeah that looks OK."
Using AI to "review" the text.

This results, and will continue to result, in the following situation: Editor 1 finds some bad AI text. Editor 1 says that the AI text wasn't reviewed, and they aren't wrong. Editor 2 says that they did review the AI text, and they aren't lying. Meanwhile, the text remains bad. Gnomingstuff (talk) 01:31, 5 December 2025 (UTC)[reply]

Enthusiastic support. I think this is the best we're going to get for a compromise option between the two LLM ideologies here.

You don't leave any room for 'acceptable' carve-outs, you've included the very direct "Editors should not use an LLM to add content to Wikipedia, whether creating a new article or editing an existing one." which, although it uses 'should' and not 'must,' serves to discourage LLM use in general, which is very desirable for me. You've preserved the spirit of NEWLLM by categorically saying "Do not" use an LLM to author an article or major expansion, you've codified LLMCOMM by saying "Do not" use LLMs for discussions.

My only suggested change would be to drop the "Why LLM content is problematic" section. We already have that covered at WP:LLM, there's no need to bloat this guideline by including it here. Other than that, I think this is exactly the kind of AI guideline we should have right now. Athanelar (talk) 23:11, 3 December 2025 (UTC)[reply]

If we do that we should probably make WP:LLM an information page. SuperPianoMan9167 (talk) 00:13, 4 December 2025 (UTC)[reply]

I think that's totally fine. We can link to it from qcne's proposal (and even promote it to supplement if necessary). It's better than adding unnecessary bloat to the guideline. The main target for this guideline, after all, is going to be people who are already using AI for something and need to be told to stop, who probably aren't going to be interested in the finer points of why LLM use is problematic. If they want to do the further reading, they can. Athanelar (talk) 03:10, 4 December 2025 (UTC)[reply]

I did it. SuperPianoMan9167 (talk) 03:16, 4 December 2025 (UTC)[reply]

Awesome, thank you @SuperPianoMan9167. qcne (talk) 11:17, 4 December 2025 (UTC)[reply]

I was reverted. I did say people could do that when I made the change. SuperPianoMan9167 (talk) 16:46, 4 December 2025 (UTC)[reply]

I appreciate your work here. I do think what you have makes sense and also is realistic in how editors work. As for "unreviewed," could a footnote work to explain what "reviewed" means? - Enos733 (talk) 23:14, 3 December 2025 (UTC)[reply]

I'd like that. FTR I'd still support this regardless as it's a massive improvement Kowal2701 (talk) 23:22, 3 December 2025 (UTC)[reply]

Your ping missed me, but I really like the version 3 proposal. I agree with GreenLipstickLesbian that "insufficiently reviewed" would be better verbiage, but it's not a blocker. This would have my support as-is. Adding raw or lightly edited LLM output degrades the quality of the encyclopedia, and frequently wastes the time of other editors who must then cleanup after it. This proposed guideline would explicitly prohibit such nonconstructive model use in a clear manner, and would serve as a useful tool for addressing and preventing instances of misuse. fifteen thousand two hundred twenty four (talk) 00:11, 4 December 2025 (UTC)[reply]

Support I like it. Since that's not an argument, I also think this is finally a version Randy in Boise can understand and follow. ~ Argenti Aertheri^(Chat?) 01:51, 4 December 2025 (UTC)[reply]

Serious concern: isn't this proposal contradictory? How can both of these statements be in the same guideline?

Do not use an LLM as the primary author of a new article or a major expansion of an existing article, even if you plan to edit the output later. (Emphasis my own)
Editors should not... Paste raw or lightly edited LLM output into existing articles as new or expanded prose. #2 strongly implies it is fine to add reviewed LLM content. But this directly contradicts #1. NicheSports (talk) 02:06, 4 December 2025 (UTC)[reply]
These do not read as contradictory to me. Nowhere in #1 does it prohibit LLM use.

even if you plan to edit the output later means editors cannot immediately add LLM output to the project with an excuse of "I'll fix it later", they must fix it first before it can be added at all. fifteen thousand two hundred twenty four (talk) 02:26, 4 December 2025 (UTC)[reply]
I'm not sure about that interpretation... what about the first part of that sentence: Do not use an LLM as the primary author.... Still pretty contradictory. Either you can use an LLM to generate a bunch of text and then edit it, or you can't. This guideline, as written, plays both sides NicheSports (talk) 02:47, 4 December 2025 (UTC)[reply]
I don't follow. #1 applies to edits which create new articles or are major expansions, situations where majority-LLM authorship would be especially undesirable, and so that is explicitly disallowed. #2 applies to editing in general, where raw or lightly-edited LLM content is disallowed. Maybe you could pose a hypothetical editing scenario where you believe a contradiction would occur, and that would help me understand your point better. fifteen thousand two hundred twenty four (talk) 03:15, 4 December 2025 (UTC)[reply]
Oh. With this interpretation, I would support! But if I don't understand this I guarantee you a lot of the non-native English speakers who are using LLMs would miss the distinction. Can we clarify the wording? NicheSports (talk) 03:19, 4 December 2025 (UTC)[reply]
It reads well to me, so I'm not sure what changes could be made, @Qcne may have some suggestions? fifteen thousand two hundred twenty four (talk) 03:30, 4 December 2025 (UTC)[reply]
I mean the header needs to be changed but it could just be changed to "Rules for using LLMs to assist with article content" or something neutral. We should specify that #1 above are rules for "major content additions" while #2 is rules for "minor content additions". NicheSports (talk) 03:31, 4 December 2025 (UTC)[reply]
I do prefer the current Do not use an LLM to add unreviewed content header, it communicates up-front what the most basic requirement is before providing more detail below.

#1 does already specify that it concerns new articles or a major expansions, and #2 already applies to all editing, narrowing its scope would introduce another point of argumentation (define "minor" vs "major"). The grammatical clarity could maybe be improved, but right now it's in good enough condition for adoption, and as said prior, I'm wary of bikeshedding. fifteen thousand two hundred twenty four (talk) 03:51, 4 December 2025 (UTC)[reply]
I also think we need to be wary of any headline like the suggested "Rules for including LLM content" for fear of implying permission. I do think the "do not" header is the best way to go about it, and the way it's currently written is fine for a compromise guideline which isn't aiming to be a total ban. Athanelar (talk) 04:00, 4 December 2025 (UTC)[reply]

The categories could just be "New articles or major expansions" and "General considerations". Could just be a bolded title before each section. That would be enough to make it clear (I support your interpretation but completely missed it when I first read). I disagree with the "unreviewed content" header, because it does contradict the guideline's language for new articles and major edits, and is going to confuse the heck out of newer editors, but I guess I can live with it for now. NicheSports (talk) 04:05, 4 December 2025 (UTC)[reply]

Comment could you remove a word from the second heading -- "Do not use an LLM to add unreviewed content" -) "Do not use an LLM to add content"? Using AI to add content to Wikipedia goes against the spirit of the consensus developed in the RFC. Mikeycdiamond (talk) 02:46, 4 December 2025 (UTC)[reply]
Called it. SuperPianoMan9167 (talk) 02:56, 4 December 2025 (UTC)[reply]

3rd time is truly a charm. I really like his one. Викидим (talk) 02:54, 4 December 2025 (UTC)[reply]

Remove the entire "Why LLM-written content is problematic" section. As I've said before, guidelines aren't information pages. Remove unnecessary words.
Change to: "Do not use an LLM to add ~~unreviewed~~ content"
"Handling existing LLM-generated content" – good section. Thumbs up from me on this one.

Cremastra (talk · contribs) 03:06, 4 December 2025 (UTC)[reply]

If guidelines aren't information pages, then shouldn't WP:LLM be tagged as an information page? SuperPianoMan9167 (talk) 03:08, 4 December 2025 (UTC)[reply]

IMO, yes, because that's what it is – it provides useful information on why LLMs are problematic and factual tips to handle and identify them. Cremastra (talk · contribs) 03:10, 4 December 2025 (UTC)[reply]

Done in Special:Diff/1325613952. WP:LLM is now an information page. SuperPianoMan9167 (talk) 03:16, 4 December 2025 (UTC)[reply]

When/if qcne's guideline goes live, we must remember to add it to the information page template there as a page that is interpreted by it. Athanelar (talk) 03:31, 4 December 2025 (UTC)[reply]

I got reverted. SuperPianoMan9167 (talk) 16:44, 4 December 2025 (UTC)[reply]

Guidelines aren't information pages, true, but you do need need to explain to people why the guideline exists; Wikipedia attracts far too many free-thinking, contrarian, and libertarian types who like asking "why?" and will resist a nameless figure telling them what to do unless they're provided a reason to do otherwise. GreenLipstickLesbian 💌🧸 03:10, 4 December 2025 (UTC)[reply]

Guidelines should absolutely link – prominently! – to pertinent information pages, and give a one or two-sentence explanation of why the guideline is necessary. But whole sections dedicated to justifying its existence mean that the important parts are covered by clouds of factual information rather than principled guidance, which is confusing for new editors, who need the guidelines most. Cremastra (talk · contribs) 03:12, 4 December 2025 (UTC)[reply]

Change to: "Do not use an LLM to add ~~unreviewed~~ content" – I don't think this is going to shape up to be that kind of full-ban proposal (unlike #1 and #3 on this page are). That said, the core text as-is would be straightforward improvement while also posing no impediment to adopting more restrictions in the future. WP:NEWLLM was a small step, this would be a larger one, I'd suggest not letting perfect be the enemy of better. fifteen thousand two hundred twenty four (talk) 03:27, 4 December 2025 (UTC)[reply]

Thanks for all the comments. I have formally opened an RfC: User talk:Qcne/LLMGuideline#RfC: Replace text of Wikipedia:Writing articles with large language models. qcne (talk) 11:28, 4 December 2025 (UTC)[reply]

Further amendment proposal #3: Athanelar

Throwing my hat in the ring, essentially the same as Festucalex's proposal but just with slightly narrower scope that doesn't imply we're trying to police people using AI for idea generation or the likes.

Athanelar (talk) 15:17, 24 November 2025 (UTC)[reply]

This completely changes the purpose of this guideline (expanding its scope from new articles to all edits) and would require a new RfC. Toadspike [Talk] 15:48, 24 November 2025 (UTC)[reply]

That's sort of the intention, yes. I assume Festucalex is doing the same, and the intention is to gauge support before a formal RfC to expand the guideline. Athanelar (talk) 15:52, 24 November 2025 (UTC)[reply]

@Qcne and Athanelar: May I have your permission to change the headers from their present titles to this:

Further amendment proposal #1: Festucalex
Further amendment proposal #2: qcne
Further amendment proposal #3: Athanelar

Just to make it clearer to other editors? I'll also change the section link that Athanelar put above. 〜 Festucalex • talk 16:18, 24 November 2025 (UTC)[reply]

Of course, thank you. qcne (talk) 16:19, 24 November 2025 (UTC)[reply]

Go ahead, thanks. Athanelar (talk) 16:31, 24 November 2025 (UTC)[reply]

Done, thank you both. I took the liberty of adding an explanatory hatnote. 〜 Festucalex • talk 16:34, 24 November 2025 (UTC)[reply]

This is all to see if people support a new guideline as opposed to a proper change. GarethBaloney (talk) 16:37, 24 November 2025 (UTC)[reply]

I suggest dropping the Large language models (or LLMs) can be useful tools part. It's not necessary and will cause an awkward divide if taken to RfC where editors who more broadly oppose LLM use would have to endorse that they are useful tools. fifteen thousand two hundred twenty four (talk) 16:27, 24 November 2025 (UTC)[reply]

I've modified my wording somewhat. I agree that part is unnecessary. Athanelar (talk) 16:35, 24 November 2025 (UTC)[reply]

As I've discussed previously, personally I would prefer any guidance not to refer to specific technology, as this changes and is not always evident to those using tools written by others, and focus on purpose. Along the lines of my previous comment in the RfC, I suggest something like "Programs must not be used to generate text for inclusion in Wikipedia, where the text has content that goes beyond any human input used to trigger its creation." (Guidance for generated images is already covered by Wikipedia:Image use policy § AI-generated images.) isaacl (talk) 18:22, 24 November 2025 (UTC)[reply]

How would Text generation software such as large language models (LLMs) should not [...] sound? Athanelar (talk) 18:26, 24 November 2025 (UTC)[reply]

Personally, I prefer using a phrase such as "Programs must not be used to generate text" as I think it better reflects what many editors want: text written by a person, not a program. I think whether it's in a footnote or a clause, text generation should be defined, so using programs to help with copy-editing, or to fill in the blanks of a skeleton outline is still allowed. Also, I prefer "must" to "should". isaacl (talk) 19:16, 24 November 2025 (UTC)[reply]

"Programs" is too nonspecific I think; a word processor is arguably a "program used to generate text" for example. We need to be somewhat specific about what sort of technology we're forbidding here. Athanelar (talk) 19:30, 24 November 2025 (UTC)[reply]

Thus why I said the meaning of text generation should be defined, and as I suggested, the generated text should not have content that goes beyond any human input used to to trigger its creation. Accordingly, word processors do not fall within the definition. isaacl (talk) 23:44, 24 November 2025 (UTC)[reply]

Honestly, I like this as the lead for Qcne's proposal above. Specifying it's about both creating articles and editing existing ones is good clarity Kowal2701 (talk) 21:41, 24 November 2025 (UTC)[reply]

Oppose. I would argue that the current text is already too restrictive (yes, AI can be abused, but so does the WP:AWB) and needs to be handled in other way altogether (like the AWB is handled). Викидим (talk) 22:04, 24 November 2025 (UTC)[reply]

This proposal is more restrictive than proposal #2, so it can't serve as a lead for it. isaacl (talk) 23:50, 24 November 2025 (UTC)[reply]

Support. I'm still going to try making incremental changes to improve the current version, but this closes the biggest loophole (inserting content into existing articles) while eliminating "from scratch". You're going to need to tighten your definitions though or "but it's only one sentence and I reviewed it". ~ Argenti Aertheri^(Chat?) 21:13, 26 November 2025 (UTC)[reply]

How would you know whether one sentence was AI-generated? Is it practical to prohibit an undetectable use? Unenforceable "laws" can lead to a general disregard for rules ("Oh, yes, driving that fast is illegal here, but everybody does it, and the police don't care" becomes "Nobody cares about speeding, and reckless driving is basically the same thing"). WhatamIdoing (talk) 06:28, 27 November 2025 (UTC)[reply]

Is it practical to prohibit an undetectable use? – Banning all use bans all use. All vandalism is prohibited, not just detectable vandalism, same for NPOV violations, promotion, undisclosed paid editing, sockpuppetry, etc. What can be detected will be, what can not will not. I do not understand your point. fifteen thousand two hundred twenty four (talk) 06:43, 27 November 2025 (UTC)[reply]

Yes, banning bans all use. But if you can't tell whether the use happened, or prove that it didn't, then we might end up with drama instead of an LLM-free wiki. WhatamIdoing (talk) 02:20, 28 November 2025 (UTC)[reply]

We can't prove COI or undisclosed paid editing either, we still don't allow them. ~ Argenti Aertheri^(Chat?) 19:39, 28 November 2025 (UTC)[reply]

And we end up with drama about that regularly, when an editor issues an accusation, and the targeted editor denies it, and how do you prove who's correct? WhatamIdoing (talk) 02:47, 2 December 2025 (UTC)[reply]

Since that's all par the course for COI, I think you may have misunderstood my !vote. I'm sorry if it sounded like I was trying to say one reviewed sentence should (not) be allowed. I meant to say: this will come up if this goes for RfC, so address it before RfC. Personally I think one reasonable length sentence is my comfort level, if only because of how much GPTs like to ramble. ~ Argenti Aertheri^(Chat?) 18:31, 2 December 2025 (UTC)[reply]

Oppose. Instead of this approach, which I do not think would make for a useful guideline, I support adopting WP:LLMCIR as a guideline.—Alalch E. 00:04, 28 November 2025 (UTC)[reply]
Support. AI causes wikipedia numerous issues like hallucinations text that does not make sense and unsourced content etc. I believe the guidline prohibiting the use of ai to generate article content is the best way forward. GothicGolem29 (Talk) 18:58, 28 November 2025 (UTC)[reply]
Oppose. LLMs are useful tools when used carefully. Anne drew (talk · contribs) 19:52, 3 December 2025 (UTC)[reply]

Expanding CSD G15 to align with this guideline

Those participating in this discussion might also be interested in my discussion about potentially expanding CSD G15 to apply to all AI-generated articles per this guideline. Athanelar (talk) 16:53, 24 November 2025 (UTC)[reply]

Discussion withdrawn within six hours by the OP due to opposition. WhatamIdoing (talk) 06:29, 27 November 2025 (UTC)[reply]

Not a proposal, just some stray ideas

I didn't participate in the original RfC and I haven't fully read the new proposals and discussions here, but I'll table the rough notes I've been compiling at User:ClaudineChionh/Guides/New editors and AI in case there are any useful ideas there. (There might be nothing useful there; I'm still slowly working my way through the discussions on this page.) ClaudineChionh (she/her · talk · email · global) 23:04, 24 November 2025 (UTC)[reply]

After reflecting on the common refrain in these discussions that AI is just a tool, we should judge LLM text by the same standards we judge human text, I also finally put some of my thoughts on this matter into essay form (complete with clickbaity title!) at User:LWG/10 Wikipedia Policies, Guidelines, and Expectations That Your ChatBot Use Probably Violates. There's also a little "spot the LLM" easter egg if anyone wants a small diversion. -- LWG ^talk 03:03, 25 November 2025 (UTC)[reply]

Further amendment proposal #4: Mikeycdiamond

During the initial discussion of this guideline, I noticed that people were complaining that others would use it to blanketly attack stuff at XFD because it might be by an AI. My proposal would fix that problem. I also noticed some slight overlap with the third sentence of my proposal and Qcne's proposal, but I would appreciate input on whether I should delete it. If my proposal were to be enacted, I believe it should be its own paragraph.

"When nominating an AI article for deletion, don't just point at it and say, "That's AI!" Please point out the policies or guidelines that the AI-generated article violated. WP:HOAX and WP:NPOV are examples of policies and guidelines that AIs commonly violate." Mikeycdiamond (talk) 00:55, 25 November 2025 (UTC)[reply]

Oppose. I would compare the situation to WP:BURDEN - deleting AI slop should be easy at the slightest suspicion, keeping it should require disclosures / proofs of veracity, etc. (like BURDEN does in the case of unsourced text). This proposal goes in the opposite direction: another editor should be able to tell me that "this article looks like AI slop. Explain to me how you created this text", in the same way they can point to BURDEN and tell me "show me your sources or this paragraph will be gone". Викидим (talk) 01:17, 25 November 2025 (UTC)[reply]

@Викидим, I have "the slightest suspicion" that the new articles you created at Attribute (art) and Christoph Ehrlich used AI tools. Exactly how easy should it be for me to get your new articles deleted? WhatamIdoing (talk) 06:35, 27 November 2025 (UTC)[reply]

The key word in my remark is "slop". I do not think that everything that AI produces is sloppy. Incidentally, I already provide full disclosures on the talk pages. I hope this would convince other editors in the veracity of the article content, so the hypothetical AfD would not happen. So, (1) I firmly believe that using AI should be allowed and (2) acknowledge the need to restrict the cost of absorbing the AI-generated text into the encyclopedia.

My personal preference would be to have a special "generative AI" flag that allows the editor to use generative AI. For some reason this idea is not popular. An alternative would be to shift the burden onto of proof of quality onto the users of generative AI. For an article showing the telltale signs of AI use, absence of published prompts or prompts indicating that the AI was involved in the search for RS can be grounds for deletion IMHO. Викидим (talk) 06:58, 27 November 2025 (UTC)[reply]

I think some editors believe "AI slop" is redundant (i.e., all generative AI is automatically slop), so your articles would be at risk of AFD.

Other editors believe that "deleting slop should be easy", even if it's not AI-related. WhatamIdoing (talk) 02:22, 28 November 2025 (UTC)[reply]

Regarding the quality of AI output: based on what I have witnessed firsthand, the modern AI models, when properly used, can provide correct software code of quite non-trivial size. I will happily admit that the uncertainties inherent in any human language make operations with it harder than than with programming languages, but the fact that AI (as of late 2025) in principle can generate demonstrably correct text is undeniable. Same thing apparently happens when AI is asked to produce, say, a summary of facts relating to X from a few-hundred-page book that references back to the pages in the original book. Here, based on personal experience, I am yet to encounter major issues, too. Writing of a Wikipedia article is very close to this latter job, so I see no reason why modern AI, properly prompted, should produce slop. Unlike in the former case, where the proof of correctness is definite, I can be wrong, and will happily acknowledge it if somebody provides me with an example of, say, Gemini 3.0 summarizing text on a "soft" topic wildly incorrectly after adequate prompts (which in this case are simple: "here is the file with text X, create summary of what it says about Y for use in an English Wikipedia article"). Викидим (talk) 04:39, 28 November 2025 (UTC)[reply]

Even if you think that modern AI can produce good content, other editors appear to be dead-set against it.

Additionally, you are opposing a request for editors to say more than "That's AI" when trying to get something deleted. Surely you at least mean for them to say "That's AI slop"? Because if "modern AI, properly prompted" is a reason for deletion, then your AI-generated articles will disappear soon. WhatamIdoing (talk) 02:50, 2 December 2025 (UTC)[reply]

I understand the internal contradiction in my posture. I stems from the fact that I look at AI from two angles, as an editor who actually likes to create articles using AI and feels good about the need to wash hands prior to cooking the text, and as an WP:NPP member where I occasionally face the slop. Викидим (talk) 06:46, 2 December 2025 (UTC)[reply]

My experience has been the opposite -- AI-generated text in my experience tends to represent sources so poorly that when I spot check some obviously-modern-AI text, there is a >50% chance that it's going to be the same old slop just with a citation tacked on.

Recent and characteristic example: Talk:Burn (Papa Roach song), generated a few days ago most likely with ChatGPT (based on utm_source params in the editor's other contributions). I don't know what LLM or prompt was used, but it took me only ~10 minutes to find several instances of AI-generated claims that sources say things that they simply don't. This isn't an especially noteworthy example either, it got it wrong in the exact same ways it usually does.

And if the article were to go to AfD -- note, I am not saying that it should -- that is actually relevant, because the AI text is presenting one source as multiple, and in one case inventing fictitious WP:SIGCOV literally just from a song's inclusion in a tracklisting. This becomes obvious when you read the cited sources, but many at AfD don't. Gnomingstuff (talk) 20:55, 2 December 2025 (UTC)[reply]

Oppose in its current form. Generally I think AI usage falls under WP:NOTCLEANUP -- a lot of AI-generated articles are about notable subjects, especially the ones where there's a language gap. But I do think there are legitimate reasons to bring AI usage up at AfD, because AI can misrepresent sources, and in particular often misrepresents them by making a huge deal out of a passing mention, making coverage seem significant that actually isn't. I also think that for certain topics -- POV forks, BLPs, etc. -- AI generation is a legitimate reason to just delete the thing. Gnomingstuff (talk) 01:23, 25 November 2025 (UTC)[reply]

Support. Explaining how WP:AfD is not cleanup is very important to clarifying the scope of this guideline Katzrockso (talk) 01:31, 25 November 2025 (UTC)[reply]

Promote WP:LLM to guideline We cite it and treat it as if it were a guideline and not an essay. For Pete's sake, just promote it already! It has everything necessary for a comprehensive LLM usage guideline. SuperPianoMan9167 (talk) 02:01, 25 November 2025 (UTC)[reply]

We've already gone through a month-long RFC to promote this to a guideline. Could you image how large the debate would be if we tried to promote that essay? It might be quicker to work on this guideline. Mikeycdiamond (talk) 02:05, 25 November 2025 (UTC)[reply]

That essay is comprehensive and well-written. In my opinion, it would be quicker to just promote it to guideline instead. Besides, it already contains guidance in the spirit of this guideline in the form of WP:LLMWRITE. It also contains WP:LLMDISCLOSE, which I think should be policy (and I am honestly baffled that it isn't). SuperPianoMan9167 (talk) 02:09, 25 November 2025 (UTC)[reply]

No one is stopping you from making an RFC. I don't disagree with you, but I am not sure if it would pass. Mikeycdiamond (talk) 02:12, 25 November 2025 (UTC)[reply]

I was looking through LLM's talk page archives; there was an RFC in 2023. The RFC showed large consensus against promoting it, but a lot has changed since then. Mikeycdiamond (talk) 02:22, 25 November 2025 (UTC)[reply]

Oppose; misses the point of NEWLLM, which is specifically to forbid AI-generated articles simply because they are AI-generated, and not because of AI-related policy violation. Athanelar (talk) 02:56, 25 November 2025 (UTC)[reply]

That's your interpretation of the guideline. Other editors will interpret it in different ways. SuperPianoMan9167 (talk) 02:57, 25 November 2025 (UTC)[reply]

The text of the guideline is pretty clear on what it forbids. It says that LLMs are not good at generating articles, and should not be used to generate articles from scratch. We can argue all day about what 'from scratch' means (which is what these amendment proposals are meant to solve) but the fact that the guideline forbids AI writing in itself is not I think ambiguous in any sense; there is no room in the proposal to argue that it's saying AI-generated articles are only bad if they violate other policies. Athanelar (talk) 03:06, 25 November 2025 (UTC)[reply]

If they don't violate other policies/guidelines, what is the point of deleting them? Isn't the sole reason of banning AIs because they violate our other policies/guidelines? Mikeycdiamond (talk) 03:11, 25 November 2025 (UTC)[reply]

Because they violate this guideline, which says you shouldn't generate articles using AI. Athanelar (talk) 03:15, 25 November 2025 (UTC)[reply]

WP:IMPERFECT and WP:ATD-E are core Wikipedia policies that collectively suggest WP:SURMOUNTABLE problems that can be resolved with editing should not be deleted. Katzrockso (talk) 03:45, 25 November 2025 (UTC)[reply]

In my eyes, a guideline which says "Articles should not be generated from scratch using an LLM" logically means the same thing as "An article generated from scratch using an LLM should not exist." It would be kind of odd to me to argue that this guideline doesn't support deletion; because what, you're saying that you shouldn't generate articles using AI, but if you happen to do so, then it's fine as long as it doesn't violate other policies/guidelines? That would mean that this guideline really does nothing at all.

And anyway, your argument also arguably applies to an AI-generated article which violates other policies/guidelines, too. I mean, those problems might also be surmountable, so what's the problem there? Should we disregard CSD G15 and say that unreviewed AI-generated articles are fine as long as the article subject is notable and the article is theoretically fixable with human intervention?

Basically, I think adding a paragraph to this guideline saying that you can't use it to support deletion would mean there's no point in this guideline existing at all, and you might as well just propose that the guideline be demoted again. Athanelar (talk) 03:57, 25 November 2025 (UTC)[reply]

Say Mary Jane generates an LLM-written article that has some major, but surmountable, issues. For example, two of her citations are to fake links, but other sources are readily available to support the claims, three of the claims are improperly in wikivoice when they should be attributed, and there is a section of the article that is irrelevant/undue. Would you suggest this article be deleted in whole, despite being otherwise a notable topic, or should editors be allowed to remedy the problems generated by the LLM usage? Katzrockso (talk) 04:04, 25 November 2025 (UTC)[reply]

I think in the given example it would essentially be the same amount of effort to TNT the article and start from scratch as to try to rework it from the flawed foundation; so yes, I'd say deletion would still be fine in that case.

Besides, what exactly would we be fighting to keep in the other case? It's not as if we'd be doing so out of a desire to respect Mary Jane's effort in creating the article. We'd be trying to hammer a square peg into a round hole for no reason other than 'well, the subject's notable and the article's here now, so...' Athanelar (talk) 04:11, 25 November 2025 (UTC)[reply]

It's my (and my other editors) belief that TNT is not a policy-based solution remedy (WP:TNTTNT), but one that violates fundamental Wikipedia PAG. In my given example, I don't see how "it would essentially be the same amount of effort to TNT the article and start from scratch as to try to rework it from the flawed foundation". The remedy in my scenario would be:

1) Replace the fake link citations to the readily available real sources that support the claim

2) Change the three sentences that are improperly in wikivoice to attributed claims

3) Remove the off-topic/irrelevant section

If you think that is more difficult than starting from scratch, I don't know what to express other than shock and disbelief. Katzrockso (talk) 06:02, 25 November 2025 (UTC)[reply]

About TNT: Has it ever occurred to you that the actual admin delete button isn't necessary? You can follow process you're thinking of (AFD, red link, start new article) or you could open the article, blank the contents, and replace it with the new article right there, without needing to spend time at AFD or anything else first. WhatamIdoing (talk) 06:37, 27 November 2025 (UTC)[reply]

(also, the article you've given as your example here would already be suitable for deletion under CSD G15 whether or not WP:NEWLLM existed, so if you don't think that article would be suitable for deletion, you're also arguing we shouldn't have CSD G15) Athanelar (talk) 04:13, 25 November 2025 (UTC)[reply]

Things are only as good as the parts that make them up. If it wasn't for HOAX or NPOV--among many other-- violations, this guideline wouldn't exist. We already have policies and guidelines for the subjects AIs violate; why shouldn't we use them? It is much clearer to point out the specific thing the text violates then blindly saying it is AI. I know AI text is relatively easy to spot now, but it will get progressively better at hiding from detection. What if people use anti-AI detection software? This guideline is meant to back up stronger claims using other policies/guidelines, not be the sole argument in an XFD. Mikeycdiamond (talk) 03:09, 25 November 2025 (UTC)[reply]

The text of this guideline literally says 'LLMs should not be used to generate articles from scratch.' Your proposed amendment to that guideline is to tell people that when deleting AI-generated articles, they cannot reference the guideline that specifically says 'Don't generate articles with AI' and must instead reference other policies/guidelines that the article violates.

That would seem to defeat the whole point of passing a guideline that says 'Don't generate articles with AI,' wouldn't it? Athanelar (talk) 03:14, 25 November 2025 (UTC)[reply]

Deletion policy wasn't really discussed all too much in the RfC or the nonexistent RFCBEFORE, so whether it defeats the purpose is not established. Many editors expressed positive attitudes towards the guideline because it provided somewhere to point to explain to people why their LLM contributions aren't beneficial. Katzrockso (talk) 03:47, 25 November 2025 (UTC)[reply]

Oppose as defeating the purpose of having a guideline. We just passed a guideline saying "don't create articles with LLMs", this would effectively negate that by turning around and saying "actually, it's fine if it doesn't violate anything else". It doesn't work that way with any other guideline and for good reason: imagine nominating something for deletion due to serious COI issues and being told "nah, prove it violates NPOV". No, the burden of proof is on the editor with the conflict because they're already violating one guideline. This is one guideline, violating one guideline is enough. ~ Argenti Aertheri^(Chat?) 21:27, 25 November 2025 (UTC)[reply]

I agree completely with the objections raised by Викидим and Gnomingstuff and Athanelar and Argenti Aertheri. AFD is about what an article is lacking (sourcing establishing notability), not about what bad content it has - just remove the bad content and AFD whatever is left if warranted. So there is no reason to treat NEWLLM differently from any other guideline there. -- LWG ^talk 01:10, 26 November 2025 (UTC)[reply]

Oppose — This reminds me of when people tried to undercut the ban on AI slop images as soon as it passed. The guideline needs to made stronger, not weaker. —pythoncoder (talk | contribs) 15:39, 26 November 2025 (UTC)[reply]

Oppose per all above. A guideline is a guideline and a statement of principle, and should be used directly, not as through proxies. If there is overwhelming evidence an article is wholly AI-generated such that it falls afoul of this guideline, the article should be deleted at AfD. Cremastra (talk · contribs) 19:01, 26 November 2025 (UTC)[reply]

Oppose. Not topical in this guideline as this guideline is not about deletion in the first place.—Alalch E. 23:52, 27 November 2025 (UTC)[reply]

Some people think it is, see #Expanding CSD G15 to align with this guideline. SuperPianoMan9167 (talk) 00:04, 28 November 2025 (UTC)[reply]

community consensus on how to identify LLM-generated writing

Not sure how I feel about this one.

On the one hand, there is some research suggesting that consensus helps: specifically, when multiple people familiar with signs of AI writing agree on whether a given piece of text is AI, they can achieve up to 99% accuracy. Individual editors were topping out at around 90% accuracy (which is still very good obviously).

On the other hand, we have to treat an edit as human-generated until there's consensus otherwise seems like a massive restriction that came out of nowhere -- it doesn't have consensus in the RfC and I'm not sure more than a handful of people even said anything close. Like, just think about how that would work in practice. Do we have to commune a whole AI Tribunal before reverting text that is very clearly AI-generated? Is individual informed judgment not enough?

This stuff is really not hard to identify. WP:AISIGNS exists, and is relatively up to date with existing research on common characteristics of LLM-generated text -- and specifically, things it does that text prior to ~2022 just... didn't do very often. This is also the case with Wikipedia text prior to mid-2022. I've been running similar if lax text crunching on Wikipedia articles before mid-2022, and the same tells have just skyrocketed. The problem is actually convincing people of this: that AI text consistently displays various patterns far more often than human text does (or for that matter, than LLM base models do), that people have actually studied those patterns, and that the individual edit they are looking at fits the pattern almost exactly. Is the page just not clear enough? Does it need additional citations? Gnomingstuff (talk) 01:11, 25 November 2025 (UTC)[reply]

I think this caveat was added to the RfC only because the closer didn't believe there was enough consensus for the promotion to guideline, and adding the requirement for consensus to determine that an article is in fact AI generated helps to soothe those who think the guideline is over-restrictive.

I also think it's really a non-issue; since there's no support currently to expand CSD G15 to apply to all AI-generated articles, any article suspected of being AI-generated in violation of NEWLLM will have to go to AfD anyway, which automatically will end up determining consensus about whether the article is AI generated and should be deleted under NEWLLM. Athanelar (talk) 03:02, 25 November 2025 (UTC)[reply]

we have to treat an edit as human-generated until there's consensus otherwise Where did this come from?

As for your number crunching, I'm not sure if I understand the results, but if we are going to start taking phrases like "pivotal role in" and "significant contributions to" as evidence of LLM contributions, then I think this starts to pose problems. Katzrockso (talk) 03:03, 25 November 2025 (UTC)[reply]

It's from the RFC closing note. Athanelar (talk) 03:07, 25 November 2025 (UTC)[reply]

That sentence in the closing note is strange to me as well, and only makes sense in the context of an AFD or community sanctions on a problem user. In terms of reversion/restoration of individual suspected LLM-edits, the WP:BURDEN is clearly on the user who added the content to explain and justify the addition, not on a reverting editor to explain and justify their reversion. In the context of LLM use, that means that if someone asks an editor "did you use an LLM to generate this content, and if so what did that process look like?" they should get an clear and accurate answer, and if they don't get a clear and accurate answer the content should be removed until they do. -- LWG ^talk 03:22, 25 November 2025 (UTC)[reply]

I think ultimately it's just an effort by the closer to avoid 'taking a side' on what they perceived as a pretty tight consensus, and to preempt a controversy about the nature of the guideline; which of course is occurring anyway. Athanelar (talk) 03:32, 25 November 2025 (UTC)[reply]

No, it's not any of those things. It's me knowing this argument was going to be made and pre-empting it. Where there's no rule or guideline, Wikipedia makes content decisions by consensus; so an edit isn't to be treated as AI-generated until either we've got consensus for a test that it's AI-generated or else we've analysed the edit and reached consensus that it's AI-generated.

I know this limits the applicability of the guideline but that's not because I'm unclear or unsure about the RFC outcome or worried about taking sides. It's because of how long-established Wikipedia custom and practice works.

A test of what actually identifies AI-generated writing should really be the next step, folks.—S Marshall T/C 08:57, 25 November 2025 (UTC)[reply]

The issue is that requiring consensus before tagging content as problematic (instead of tagging the content and then following WP:BRD) imposes an unnecessary restriction, even on current practices, which wasn't brought up in the discussion. This close would mean, for example, that we can't tag a page as {{AI-generated}} anymore without first requiring an explicit consensus. This isn't Wikipedia custom and practice for tagging and has never been. Chaotic Enby (talk · contribs) 09:10, 25 November 2025 (UTC)[reply]

The best solution to these problems is to reach consensus on a test. But obviously, tagging doesn't need consensus and never has. What's not allowed is to revert or delete content for being AI-generated unless there's consensus to do so. Just to be clear: all our normal rules apply. You can still revert for all the usual reasons. BRD still applies. ONUS still applies. You can still tag stuff you suspect might be problematic.—S Marshall T/C 09:43, 25 November 2025 (UTC)[reply]

But obviously, tagging doesn't need consensus and never has. This is certainly not obvious from your close, which says that this means that we have to treat an edit as human-generated until there's consensus otherwise. A closure should only summarize the given discussion, not add new policies that need to rely on the word of the closer for later clarification, even if they would be a logical development from previous practice. Chaotic Enby (talk · contribs) 10:44, 25 November 2025 (UTC)[reply]

Summarize and clarify. A close should summarize the community's decision and clarify its relationship to existing policy and procedure. What we don't want looks like this: I think this user is adding AI-generated content so I'm going to quick-fail all their AfC submissions and then follow them round reverting and prodding.—S Marshall T/C 12:22, 25 November 2025 (UTC)[reply]

What's not allowed is to revert or delete content for being AI-generated unless there's consensus to do so --

I'm not aware of anything in policy stating this -- certainly not AI policy, because we don't have any. Based on the consensus of this RfC, and on the fact that people are already reverting and deleting content for being AI to relatively little outcry, I don't think there would be consensus for such a prohibition, and I think most people in the RfC would be surprised to learn they were !voting for one. Gnomingstuff (talk) 10:16, 26 November 2025 (UTC)[reply]

As far as a test being the next step... I mean I'm trying Jennifer. We have WP:AISIGNS and are trying to make it as research-backed as possible. It is an evolving document, and I'm sure most contributors to it have their own list of personal tells they've noticed. (For example I trust @Pythoncoder's judgment implicitly on detecting AI but they see stuff I have no idea about. Apologies if you don't want the ping, I figured the outcome here is relevant to you.) But there are several problems:

Problem 1: Getting people to actually believe that these are signs of AI use. There seems to be no amount of evidence that is enough.

Problem 2: Getting people to interpret things correctly. This stuff gets very in-the-weeds, and AISIGNS leaves out a lot for that reason. For instance, one "personal tell" I have noticed is that Additionally, starting a sentence with capitals and punctuation, is a strong indicator of possible AI use, but the word additionally as an infix isn't necessarily a sign. Other tells I have are still kind of in the oven until I can hammer out a version with as few false positives as possible, with as little potential for confusion.

Problem 3: We are doomed to remain in the world of evidence, not proof. It is impossible to prove whether AI was used in an edit unless you are the editor who made it. Since we have had AI text incoming since 2023, many of those editors aren't around anymore. Other editors are not forthcoming with the information. Some dodge the question, some trickle-truth it, small handful of editors lie. Gnomingstuff (talk) 10:34, 26 November 2025 (UTC)[reply]

This is exactly the shit I mean. When:

A word is identified in multiple academic studies as very over-represented in LLM-generated text compared to human text
The most obvious phrase containing that word is roughly 1605% more common in one admittedly less rigorous sample of AI-generated edits compared to human-generated -- a substantial portion of which are human-generated articles tagged as promotional

...then yes, it would seem to be empirical evidence? No one can prove how a user produced an edit besides that user, but when patterns start showing up that happen to be similar patterns to ones cited in external sources as characteristic of AI use, that is telling. Gnomingstuff (talk) 03:31, 25 November 2025 (UTC)[reply]

Empirical evidence of what? I have humanly generated both those phrases before (not on Wikipedia, I don't think, but elsewhere), are you going to suggest deleting my contributions on these types of grounds, because your model suggests that LLMs use these phrases at higher rates? Keep in mind that human language is changing as a result of LLMs ([5]), for better or worse. Katzrockso (talk) 03:53, 25 November 2025 (UTC)[reply]

Empirical evidence that these words and phrases appear more frequently in the aggregate of AI-generated text -- in this case, on Wikipedia -- compared to the aggregate of human-generated text on Wikipedia. They also tend to occur together, and occur in the same ways, in the same places in sentences, the same forms, etc. So if an edit shows up with a whole bunch of this crammed into 500 words, that's a very strong indication that the text is probably AI. Not a perfect indication -- for instance, this version of Julia's Kitchen Wisdom is way too early for AI but sounds just like it -- but a very strong one.

I am aware of the studies that human language is changing as a result of LLMs -- one study suggests that this particular set of words is really just a supercharge to increases in those words that were naturally happening already. That particular study is less convincing because it seems to think podcasts are never scripted or pre-written, which is... not true. But anecdotally I do see it happening. (It's a bit weird to hear this stuff out of human mouths in the wild, although that's probably just the frequency illusion given how much AI text I am seeing all day.) Not sure how much that affects Wikipedia, especially the last few years of AI stuff to deal with, given that the changes in human language feel like a lagging indicator. Gnomingstuff (talk) 10:08, 26 November 2025 (UTC)[reply]

Incidentally GPTZero scans that revision of Julia's Kitchen Wisdom as 98% human, ~~highlighting the pivotal role of~~ illustrating the benefit of using multiple channels of evidence to assess content. -- LWG ^talk 17:47, 26 November 2025 (UTC)[reply]

I have the opposite reaction to Individual editors were topping out at around 90% accuracy (which is still very good obviously): I look at that and say even the best of the best were making false accusations at least 10% of the time.

Imagine the uproar if someone wanted to work in Wikipedia:Copyright problems, but they made false accusations of copyvios 10% of the time. We would not be talking about how good they are.

If anything, this information has convinced me that unilateral declarations of improper LLM use should be discouraged. Maybe tags such as Template:AI-generated should be re-written to suggest something like "This article needs to be checked for suspected AI use". WhatamIdoing (talk) 07:01, 27 November 2025 (UTC)[reply]

The template already says the article may contain them. There is a separate parameter, certain=y, that is added for cases where the AI use is unambiguous. Gnomingstuff (talk) 04:21, 28 November 2025 (UTC)[reply]

There does not need to be a community consensus on how to identify LLM-generated writing. It's a technical question. Different editors will apply different methods. Disputes will be resolved in the normal way. —Alalch E. 23:49, 27 November 2025 (UTC)[reply]

Tell that to the closing admin who specifically said in the RfC close In particular we need community consensus on (a) How to identify LLM-generated writing [...] Athanelar (talk) 00:13, 28 November 2025 (UTC)[reply]

That statement is true because most signs of AI writing, except for the limited criteria of G15, are largely subjective. SuperPianoMan9167 (talk) 00:17, 28 November 2025 (UTC)[reply]

A closer does not need to be an admin and the closer wasn't in this case. GothicGolem29 (Talk) 18:48, 28 November 2025 (UTC)[reply]

Further amendment proposal #5: Argenti Aertheri

We barely got the thing passed, so I propose we make small, incremental, changes. Changing LLMs to all AI seems as good a place to start as any other, and probably less controversial than some. ~ Argenti Aertheri^(Chat?) 03:53, 25 November 2025 (UTC)[reply]

Oppose. One of the primary criticisms the first amendment proposals were trying to address was the prominent criticism during RfC that the term 'from scratch' has no agreed-upon definition and thus the scope of which articles this guideline applies to isn't clearly defined; your proposal doesn't address that, and in the process introduces a whole host of new ambiguity as to what tools are and aren't allowed, and in what capacity one might be allowed to use them. Athanelar (talk) 04:00, 25 November 2025 (UTC)[reply]

There's a definition at wikt:from scratch. Merriam-Webster offers a similar definition.

There were 37 uses of "from scratch" in the RFC; most of them were entirely favorable. There were 117 editors in the discussion; I see four who complained about the "from scratch" wording, and some of them (example) would still be valid no matter what words were used. WhatamIdoing (talk) 07:10, 27 November 2025 (UTC)[reply]

GPT is a type of LLM, not something that can be contrasted with it. What other forms of "artificial intelligence" (a dubious + nebulous concept) are creating Wikipedia articles other than LLMs? Katzrockso (talk) 04:00, 25 November 2025 (UTC)[reply]

The point isn't to address all the problems in the guideline that passed, just one: what technologies does this include. I know AI is a nebulous concept, that's actually why I chose it, so that WP:Randy from Boise can tell in seconds if his use of his software is included. Porn is a nebulous concept too, but we all know it when we see it. ~ Argenti Aertheri^(Chat?) 04:20, 25 November 2025 (UTC)[reply]

What is not covered by the existing guidelines that your change would include? Katzrockso (talk) 05:56, 25 November 2025 (UTC)[reply]

1) Remove the unnecessary "can be useful tools", it's not relevant here.

2) Replace the technical term "LLM" with a more readily accessible definition that clarifies that we want human intelligence, not artificial intelligence, regardless of the exact technology being used. Ergo explicitly stating GPTs despite them being a subset of LLMs, people know what a GPT is and if they're using one. ~ Argenti Aertheri^(Chat?) 06:33, 25 November 2025 (UTC)[reply]

The "can be useful tools" part was just implemented as a part of the RfC on the two-sentence guideline, removing half of the approved text from the RfC is not a good start.

"clarifies that we want human intelligence, not artificial intelligence" makes no sense, is less clear than the current version and if anything muddies the scope and applicability of this guideline. Katzrockso (talk) 09:34, 25 November 2025 (UTC)[reply]

Would you find it acceptable to change the current wording from "LLMs" to "LLMs, including GPTs" if no other changes were made? ~ Argenti Aertheri^(Chat?) 19:02, 25 November 2025 (UTC)[reply]

I would find it acceptable/unobjectionable, I just think it's superfluous Katzrockso (talk) 00:14, 26 November 2025 (UTC)[reply]

It's redundant if you know that GPTs are LLMs, but not if you're just Randy from Boise asking ChatGPT about the Peloponnesian War. Randy would likely have an easier time understanding the guideline with that explicitly spelled out. ~ Argenti Aertheri^(Chat?) 01:35, 26 November 2025 (UTC)[reply]

Maybe a footnote like the one in WP:G15 would work, which says The technology behind AI chatbots such as ChatGPT and Google Gemini. SuperPianoMan9167 (talk) 02:07, 26 November 2025 (UTC)[reply]

Works for me, hopefully it works for Randy too. Should I reword this proposal or WP:BRD? ~ Argenti Aertheri^(Chat?) 07:22, 26 November 2025 (UTC)[reply]

I went ahead and added the footnote. SuperPianoMan9167 (talk) 22:47, 26 November 2025 (UTC)[reply]

This is much clearer/explanatory than the term "GPTs" or "artificial intelligence". Support this change Katzrockso (talk) 07:47, 27 November 2025 (UTC)[reply]

I think that @Katzrockso and @Argenti Aertheri make a good point, and it's one that could be solved by making a list. Imagine something that says "This bans article creation with AI-based tools such as ChatGPT, Gemini, and that paragraph at the top of Google search results. This does not ban the use of AI-using tools such as Grammarly, the AI grammar tools inside Google Docs, or spellcheck tools."

These lists don't need to be in this guideline, but it might help if they were long. It should be possible to get a list of the notable AI tools in Template:Artificial intelligence navbox. WhatamIdoing (talk) 07:17, 27 November 2025 (UTC)[reply]

So this begs the question why is Grammarly spell check allowed but not ChatGPT spellchecking? I'm not saying that people should plop "Write me a Wikipedia article" into a LLM and paste that into Wikipedia, but these LLMs have other use cases too. What use cases people want to prohibit/permit really need to be laid out more explicitly for this to be workable. Katzrockso (talk) 07:46, 27 November 2025 (UTC)[reply]

Here (as someone who admittedly has not used Grammarly since their adoption of LLM tech) it would (potentially) be that Grammarly uses a narrow and specific LLM model that has additional guardrails that prevent it from acting in the generative manner that ChatGPT does. Or at least that would have been the smart way of rolling out LLM tech for Grammarly, as said I've not used it so I don't know where they have implemented rails. -- Cdjp1 (talk) 16:54, 27 November 2025 (UTC)[reply]

In my experience reading Grammarly-edited text, it doesn't always use those guardrails well. It also tends to push a lot of more expansive AI features on people. Gnomingstuff (talk) 17:07, 29 November 2025 (UTC)[reply]

In re this begs the question why is Grammarly spell check allowed but not ChatGPT spellchecking? Yes, well, that is a question, isn't it? And I think it's a question that editors won't be able to answer if they don't realize that ChatGPT can do spellchecking.

https://arxiv.org/html/2501.15654v2 (which someone linked above) gave 300 articles to a bunch of humans, and asked them to decide whether each article was AI-generated or human-written. They learned that an individual who doesn't use LLM incorrectly missed 43% of the LLM-written articles and falsely accused 52% of the human-written articles as being LLMs. This is in the range of a coin-flip; it is almost random chance.

I'm reminded of this because those non-users (e.g., me) are also going to be unaware of the various features or tools in the LLMs. A list might inform people of what's available, and therefore let us use a bit more common sense when we say "This tool is acceptable for checking your spelling, but that tool is prohibited." WhatamIdoing (talk) 02:30, 28 November 2025 (UTC)[reply]

It's spellcheck, no one cares how you figure out how to spell a word as long as you knew which word you were trying to spell. I'd be wary of grammarly unless they put guardrails as Cdjp1 suggests though, and if they have guardrails then that's what needs to be specified: which built-in guardrails make it ok? ~ Argenti Aertheri^(Chat?) 04:50, 28 November 2025 (UTC)[reply]

Nobody should care how you figure out how to spell a word, but it sounds like some editors aren't operating with that level of nuance. WhatamIdoing (talk) 02:52, 2 December 2025 (UTC)[reply]

LLMs can't do spellchecking in the sense we are used to. They can do something that can be similar in output, but the underlying process used won't be the same, due to the fundamental way llms work. In terms of tools, any llm-use will have this underlying generative framework because everything is converted into mathematics and then reconverted in some way. As Cdjp1 and Gnomingstuff note, refining any llm-use is about building the right guardrails, but these don't change the way the underlying program works. The complication with Grammarly is that it has its original software and new llm-based tools, and I'm not sure how much control or even knowledge the user has. Same possibly with Microsoft these days. CMD (talk) 07:24, 2 December 2025 (UTC)[reply]

In a couple of years, will the average person realistically have a way to use ordinary word processing software (e.g., MS Word or Google Docs) without an LLM being used somewhere in the background? I don't know. Maybe it just looks inevitable because of where we are in the Gartner hype cycle right now, but the inadvertent use of LLMs feels like it will only get bigger over time. WhatamIdoing (talk) 19:59, 4 December 2025 (UTC)[reply]

Since copying over the footnote seems pretty non-controversial, version 2:

While true, it's not relevant and only makes this mess messier. If it's a guideline about content creation then it doesn't really matter how well LLMs can do other tasks. ~ Argenti Aertheri^(Chat?) — Preceding undated comment added an unspecified datestamp.

Since you didn't get any direct replies to this, here's a late comment:

We're trying to present this as a guideline that involved reasonable people making a reasonable choice about reasonable things, rather than a bunch of ill-informed AI haters. The guideline is less likely to seem unreasonable or to be challenged by pro-AI folks if it acknowledges reality before taking away their tools. Therefore the guideline acknowledges and agrees with their POV ("can be useful"), names the community's concern ("not good at creating entirely new Wikipedia articles"), and then states the rule ("should not be used to generate new Wikipedia articles from scratch"). WhatamIdoing (talk) 20:07, 4 December 2025 (UTC)[reply]

Agreed. The rules are principles, not lists of things that editors should and should not do. SuperPianoMan9167 (talk) 20:10, 4 December 2025 (UTC)[reply]

Agreed. Alaexis_¿question? 21:04, 5 December 2025 (UTC)[reply]

Supplemental essay proposal on identifying AI-generated text

Seeing as it has been noted (particularly by the RfC closer) that the existence of a guideline which prohibits AI-generated articles necessitates the existence of a consensus standard on identifying AI-generated articles, I've drafted a proposal which aims to codify ways that AI text can be identified for the purpose of enforcing this guideline (and any other future AI-restricting guideline)

The essay content is largely redundant to WP:AISIGNS but rather than just a list of AI indicators it specifically aims to be a standard by which content can be labelled as AI-generated.

Your feedback and proposed changes/additions are most welcome at User:Athanelar/Identifying AI-generated text. If reception is positive I will submit an RFC.

Pinging some editors who were active in this discussion: @Qcne @Voorts @Gnomingstuff @Festucalex @Mikeycdiamond @Argenti Aertheri @LWG Athanelar (talk) 17:55, 26 November 2025 (UTC)[reply]

I agree a consensus standard is implied, but I would guess any rate of false positives or negatives will render either a guideline or tools controversial. I have a few suggestions: 1) I prefer a 'weak' or humble standard, using various criteria or methods may suggest but not prove AI use. 2) Checking the volume of changes, either as a single submission or in terms of bytes/second from a given IP or account, may occasionally serve as a cheaper semi-accurate proxy for AI detection, although once again there will be false positives and negatives. 3) Given the rapid development and diversity of AI tools, and the resources involved, I do not think developing uncontroversial tools for AI detection is a feasible goal in the near future. Deploying automatic tools sitewide or on-demand would likely be prohibited by cost, but if individual users wish to run them, I think their findings could contribute evidence towards a finding - so long as we guard against bias and overconfidence in the use of these tools. --Edwin Herdman (talk) 19:32, 26 November 2025 (UTC)[reply]

The "suggest" wording is a good idea. For those who worry it may not be workable, our entire concept of notability rests on similar wording (e.g. "presumed to be suitable", "typically presumed to be notable"). If we're going down this road, I'd support wording like this and judgement by consensus in case of dispute. Toadspike [Talk] 21:08, 26 November 2025 (UTC)[reply]

Regarding AI tools changing quickly, I did some very very very rough analysis of text pre- and post-GPT-5 if anyone is interested. Will revisit once I have more data. Gnomingstuff (talk) 03:57, 27 November 2025 (UTC)[reply]

I made one small tweak -- adding the bit about edits having to be post-2022 for AI use to even be possible. "Strongly suggest" is the best we can do, unfortunately. If the burden of proof is on the person tagging/identifying AI-generated text, then that is almost literally impossible to provide because no one knows how someone made an edit but that person.

As far as automated tools, you could do worse than just scraping all articles containing >5 instances (or whatever) of the listed "AI vocabulary" words, and then manually checking those to see what's up. (This is basically what I've been doing, minus the tools.) The elephant in the room, though, is that LLMs are changing right now -- GPT-5.1 came out just 2 weeks ago. We also almost never know which tools people are using, let alone the version or prompt or provided sources. And all that is compounded by the fact that even researchers don't know why AI sounds the way it does. The whole thing is largely a black box, and it's honestly kind of surprising we (as in we-the-public) have figured anything out at all. Gnomingstuff (talk) 00:11, 27 November 2025 (UTC)[reply]

Thanks for your tweak. I haven't had any adverse reaction to this essay yet, so I'll give it until the 24 hour mark and if nobody's raised any major objections I'll put it up for RfC, and providing that passes then we can link to my essay from the NEWLLM page and that'll at least solve one of the RfC close's two problems. Then it'll just be a matter of codifying what we do if something breaches NEWLLM; but people seem to be generally on board with 'send it to AfD' as a solution for that already.

My fingers are crossed we can move onto RfC for a proposal to expand NEWLLM to include all AI-generated contributions and not just new articles. Athanelar (talk) 00:16, 27 November 2025 (UTC)[reply]

This is redundant to WP:AISIGNS. Perhaps some content can be merged with AISIGNS. —Alalch E. 23:47, 27 November 2025 (UTC)[reply]

Note for everyone subscribed to this discussion; I have raised an RfC at the essay's talk page. Athanelar (talk) 00:20, 28 November 2025 (UTC)[reply]

A hypothetical scenario

Here's a hypothetical scenario to consider. Say you have an editor writing an article. It's a well-written, comprehensive article. They publish their draft and it gets approved at AfC and moved to mainspace. If that editor then says "I used AI to write the first draft of this article", does this guideline require the article be deleted, even though the content is perfectly acceptable? SuperPianoMan9167 (talk) 00:52, 27 November 2025 (UTC)[reply]

Personally I believe that if the article has been comprehensively rewritten and checked line by line for accuracy prior to asking other editors to spend time on it at AfC, the tools used for the initial draft don't matter. -- LWG ^talk 01:04, 27 November 2025 (UTC)[reply]

To me, "from scratch" implies a lack of rigorous review or corrections from a human editor. I attempted to clarify this in [6], but it got reverted. No reasonable person would require a perfectly-written and verified article to be deleted merely because an early draft was written with software assistance. Anne drew (talk · contribs) 01:05, 27 November 2025 (UTC)[reply]

It's possibly already happened, and certainly has been used for edits. One temporary account recently asked about it at the help desk. I wrote my questions 0 and 1 for this case. Reasons I think are good for disallowing it are: 1) We don't like the 'moral hazard' of letting a part of the process not have human input, and the larger the change without human input and oversight, the greater the potential problem. 2) Openly allowing AI use might cause human reviewers to be overwhelmed. 3) The copyright status of Wikipedia content could be challenged, especially if 'substantive' AI edits are allowed to stand, a concern I think may be decisive for Wikimedia Foundation and ArbCom given the potential for losses. I think a lot of the rest of it is similar to the risks we accept in ordinary editing - bias and errors may propagate for a long time, but we hope that eventually somebody spots the problem. --Edwin Herdman (talk) 02:34, 27 November 2025 (UTC)[reply]

It has absolutely already happened, to the tune of thousands of articles that we know about. And the ones we know about, we know about because there were enough signs in the text to be identifiable as AI. Gnomingstuff (talk) 02:35, 27 November 2025 (UTC)[reply]

@Edwin Herdman, I don't think I understand The copyright status of Wikipedia content could be challenged, especially if 'substantive' AI edits are allowed to stand, a concern I think may be decisive for Wikimedia Foundation and ArbCom given the potential for losses.

Does this mean that:

some of Wikipedia's contents will not be eligible for copyright protection? In that case, the WMF isn't going to care (they're willing to host public domain/CC-0 content, though they would prefer that it was properly labeled), and protecting editors' copyrights is none of ArbCom's business. (ArbCom cares about editors' behavior on wiki. They are not a general-purpose governance group.)
someone might (correctly) claim that they own the copyright for the AI-generated/AI-plagiarized contents of an article? In that case, the WMF will point them to the WP:DMCA process to have the material removed. If the copyright holder wishes to sue someone over this copyvio, they will need to sue the editor who posted it (not the WMF or ArbCom). This is in the foundation:Policy:Terms of Use; look for sentences like "Responsibility — You take responsibility for your edits (since we only host your content)" (emphasis in the original) and "You are responsible for your own actions: You are legally responsible for your edits and contributions" (ditto).

WhatamIdoing (talk) 05:48, 27 November 2025 (UTC)[reply]

I wrote that badly, but you've clarified the issue. I can't assume Wikipedia will always benefit from the Safe Harbor provision - the DMCA might be amended again or even repealed, or Wikipedia might be found to fail the Safe Harbor criteria. Even without a suit seeking damages, the DMCA process imposes at least some administrative burdens which I would consider worth a rough worst-case scenario estimate. I'll be happy if wrong; AI risks on copyright aren't totally unlike what any editor can do without AI, what's different is mainly spam potential and the changing legal landscape. My final thought is that LLMs don't inherently bring copyright issues - it's possible an LLM with a clear legal status might be developed. --Edwin Herdman (talk) 08:38, 27 November 2025 (UTC)[reply]

Based purely on the plain meaning of 'from scratch,' I would say that if the majority of the article's text is AI generated, then this guideline would suggest that the article should be deleted.

If a 'first draft' was written with AI and then substantially rewritten by a human, it would essentially be the same as doing it from scratch by the human, so it gets a pass.

'From scratch' to me implies you had nothing before, now you have an article. If that article was written with AI, then it falls afoul of this guideline. Athanelar (talk) 15:07, 27 November 2025 (UTC)[reply]

I would argue that there are actually two ways to parse how the “from scratch” guideline applies:

1. (as intended) You may not use an LLM to write a wholly new article that does not exist on Wikipedia as of yet.

2. You may not write an article by asking an LLM to generate it “from scratch”- ie without putting in any information. (Implied- you may use an LLM if you provide it with raw data)

In other words, it is entirely possible to read the “from scratch” clause as referring to the LLM generation process, and not the Wikipedia article process. ~2025-36891-99 (talk) 20:09, 27 November 2025 (UTC)[reply]

The answer is: No. To delete an article, it must be done in accordance with the wp:Deletion policy. —Alalch E. 23:37, 27 November 2025 (UTC)[reply]

IMO this misses the point. We don't set policy based on what it is possible, but based on the overall impact on the project. For example, I am sure there are users who could constructively edit within WP:PIA from their first edit, but we don't let them, because on average letting inexperienced users edit in that topic area was leading to huge problems. Same logic applies here. We need to set LLM policy based on overall impact to the project. NicheSports (talk) 23:57, 27 November 2025 (UTC)[reply]

We don't let new editors edit in the PIA topic area because ArbCom remedies are binding and cannot be overturned by fiat. This guideline is not like that. Reasonable exceptions should still be allowed. SuperPianoMan9167 (talk) 00:13, 28 November 2025 (UTC)[reply]

I was speaking more generally about how our LLM PAGs should develop in the future. This guideline is far from ideal and clearly is going to change. I don't know the right first step, I just know what I want it to get to. NicheSports (talk) 00:16, 28 November 2025 (UTC)[reply]

Is your ideal LLM guideline something like WP:LLM? SuperPianoMan9167 (talk) 00:20, 28 November 2025 (UTC)[reply]

WP:LLM covers a lot, so there are parts I'd probably agree with, but as it relates to usage of LLMs, no. My ideal policies would be

LLMs cannot be used to generate article prose or citations, regardless of the amount of review that is subsequently performed, unless the editor is experienced and possesses the llm-user right
Experienced editors could apply for the llm-user right, with the same requirements as autopatrolled
Users without the llm-user right could use LLMs for non prose-generating tasks. A few examples of this could be generating tables, doing proofreading, etc. We would need to draft an approved list of uses
I want to add a G15 criteria for machine-generated articles with multiple material verification failures. This would efficiently handle problematic LLM-generated articles
Content policy compliant LLM-generated articles would not need to be deleted. Although if they were discovered to be created by a user without the llm-user user right, we would warn the user about not doing so in the future.

NicheSports (talk) 00:38, 28 November 2025 (UTC)[reply]

So kinda like how AutoWikiBrowser (LLMs, like AWB, could be considered automated editing tools that assist a human editor) requires special approval? SuperPianoMan9167 (talk) 00:41, 28 November 2025 (UTC)[reply]

Yes, but with more restrictive criteria than AWB. I think the autopatrolled requirements are a nice fit (and kind of spiritually related) NicheSports (talk) 00:46, 28 November 2025 (UTC)[reply]

Please drop tables from the list of approved uses, it does it, and on face value seems to do it well, but under the hood is a different story. Maybe there's some version that does it well, or we could put guide rails on it, but GPTs format tables with overlapping column and row spans that are barely human readable. They're great with templates in general though if you check they haven't done more than copy and paste. "Put this text in this template following these rules" usually works beautifully, but not tables, the wiki table formatting is just too weird I guess. ~ Argenti Aertheri^(Chat?) 02:10, 28 November 2025 (UTC)[reply]

This is a very nice proposal, reflecting both the current situation (AI is simply as good as most humans on many technical tasks, so banning its use makes no sense) and concerns about a flood of disastrous content generated with AI due to ignorance, greed, or malice. Викидим (talk) 18:24, 2 December 2025 (UTC)[reply]

Content self feedback

I would like to suggest that the concept of closed loop system be considered and somehow discussed in the guideline. The LLM nightmare is when other sources pick half baked content from AI generated material, and said sources pick it up again themselves. The feedback can continue and eventually many sources will affirm each other. The term to use then is: jambalaya knowledge. Yesterday, all my dreams... (talk) 16:17, 29 November 2025 (UTC)[reply]

We do have WP:CITOGENESIS which describes this regarding Wikipedia, not quite the same but Wikipedia is a big feeder for AI training sets. Gnomingstuff (talk) 17:06, 29 November 2025 (UTC)[reply]

I did not know about that page, so thank you. The LLM problem is in fact a super turbocharged version of that. Yesterday, all my dreams... (talk) 20:59, 29 November 2025 (UTC)[reply]

We do have a mainspace article on model collapse which is the term for this phenomenon in large language models. It's not really relevant to this guideline specifically, though. Athanelar (talk) 14:29, 30 November 2025 (UTC)[reply]

Nutshell

@Novem Linguae: Nothing personal, but I challenge your assertion that this page is too short to have a nutshell. Having a modicum of humor helps keep this project from drowning in bureaucracy. — Hex • talk 14:38, 30 November 2025 (UTC)[reply]

Discussion at Wikipedia:Village pump (policy) § RfC: Replace text of Wikipedia:Writing articles with large language models

You are invited to join the discussion at Wikipedia:Village pump (policy) § RfC: Replace text of Wikipedia:Writing articles with large language models. –Novem Linguae (talk) 23:40, 5 December 2025 (UTC)[reply]