User talk:Polygnotus/Scripts/AI Source Verification

Questions

Polygnotus, your screenshot raises a few questions for me.

Does it treat every "claim" as the preceding sentence? Ideally, a claim would be all text since the last citation.
Are the results a binary "SUPPORTED" or "UNSUPPORTED"?
What does the AI report back if it can't access a source? This was an issue I've seen in earlier experiments, where for example the AI would hit a journal abstract and not have access to the document, and report back that text was not supported.
Does this work for PDFs or other file types?

Best, CMD (talk) 12:58, 12 June 2025 (UTC)
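The "all text since the last citation" idea from the first question can be sketched as follows. This is only an illustration, assuming the paragraph is available as plain text with footnote markers such as [1]; the function name `claimForCitation` and the sample paragraph are made up and are not taken from the script.

```javascript
// Hypothetical helper, not taken from the script: returns the text between
// the previous footnote marker and the n-th marker (1-based) in a paragraph.
// Adjacent markers such as "[1][2]" yield an empty string for the second one.
function claimForCitation(paragraph, n) {
  const matches = [...paragraph.matchAll(/\[\d+\]/g)];
  if (n < 1 || n > matches.length) return null;
  const end = matches[n - 1].index;
  const prev = matches[n - 2];
  const start = prev ? prev.index + prev[0].length : 0;
  return paragraph.slice(start, end).trim();
}

// Made-up sample text with two citations.
const paragraph =
  "Forces moved in the afternoon. Fighting continued overnight.[1] " +
  "The number of troops deployed is disputed.[2]";

console.log(claimForCitation(paragraph, 1));
console.log(claimForCitation(paragraph, 2));
```

With this rule the first citation covers both opening sentences, while the second covers only the final sentence.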

Currently, that is one of the things to fix.
No, the options are SUPPORTED, PARTIALLY SUPPORTED, NOT SUPPORTED, or UNCLEAR.
Something like "I can't access the source", so the verdict would be UNCLEAR.
Claude is able to deal with PDFs; I haven't checked the other file types.

Polygnotus (talk) 13:07, 12 June 2025 (UTC)
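A minimal sketch of how the four verdicts could be read back out of a model reply. It assumes the reply contains a line of the form "Verdict: X"; the function name `parseVerdict` and the fallback to UNCLEAR are assumptions for illustration, not the script's actual behaviour.

```javascript
// The four verdict labels named in the answer above.
const VERDICTS = ["SUPPORTED", "PARTIALLY SUPPORTED", "NOT SUPPORTED", "UNCLEAR"];

// Hypothetical parser: looks for "Verdict: X" in the model's reply and
// falls back to UNCLEAR when no recognisable verdict is found.
function parseVerdict(reply) {
  const match = /verdict:\s*([A-Z ]+)/i.exec(reply);
  if (!match) return "UNCLEAR";
  const text = match[1].trim().toUpperCase();
  return VERDICTS.find((v) => text.startsWith(v)) ?? "UNCLEAR";
}

console.log(parseVerdict("Verdict: SUPPORTED\nExplanation: ..."));
console.log(parseVerdict("I can't access the source"));
```

Treating a missing verdict as UNCLEAR keeps an inaccessible source from being reported as NOT SUPPORTED.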

Thanks. And in plain language, if I installed and tested this script would that be using up a certain amount of 'credits' from whatever AI account this is accessing? CMD (talk) 13:12, 12 June 2025 (UTC)

@Chipmunkdavis Gemini is free (although you need to create a free API key, and there are limits of course). Claude and ChatGPT cost money (but very small amounts per query). Polygnotus (talk) 13:13, 12 June 2025 (UTC)

So the script tries to access a Gemini account my browser is logged into? CMD (talk) 13:16, 12 June 2025 (UTC)

No, that would be unsafe. Google AI Studio provides you with a free API key. You paste that API key in the script and that way Google knows if you haven't hit your limit yet. See also API. The API key is stored in the browser (in localStorage) and you can delete it at any time by pressing the "Delete API key" button. But without an API key the script won't work. Polygnotus (talk) 13:19, 12 June 2025 (UTC)
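Storing and deleting a key in localStorage could look roughly like the sketch below. The storage key name `sourceVerifierApiKey` and all function names are hypothetical; the real script may do this differently. The functions take any object with the Web Storage interface, so in a browser one would pass `window.localStorage`.

```javascript
// Hypothetical storage key name; the real script may use a different one.
const STORAGE_KEY = "sourceVerifierApiKey";

// `storage` is anything with the Web Storage interface
// (getItem / setItem / removeItem), e.g. window.localStorage.
function saveApiKey(storage, apiKey) {
  storage.setItem(STORAGE_KEY, apiKey.trim());
}

function loadApiKey(storage) {
  return storage.getItem(STORAGE_KEY); // null when no key is stored
}

// What a "Delete API key" button handler could call.
function deleteApiKey(storage) {
  storage.removeItem(STORAGE_KEY);
}

// In-memory stand-in so the sketch also runs outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (k) => (data.has(k) ? data.get(k) : null),
    setItem: (k, v) => void data.set(k, String(v)),
    removeItem: (k) => void data.delete(k),
  };
}

const storage = createMemoryStorage();
saveApiKey(storage, "  example-key  ");
console.log(loadApiKey(storage));
deleteApiKey(storage);
console.log(loadApiKey(storage));
```

Because the key never leaves the browser's own storage, removing it is a single `removeItem` call.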

@Chipmunkdavis: Using Claude (my favorite) costs a fraction of a cent. There are AI providers that are even cheaper, like DeepSeek, but DeepSeek's quality is not up to par. Polygnotus (talk) 13:25, 12 June 2025 (UTC)

Install

Hi Polygnotus, I installed it ("Install" button: Special:Diff/1312595956/1324207087), but don't see any way to activate the application. I'm surely doing something simple wrong. -- GreenC 06:08, 26 November 2025 (UTC)

@GreenC Note that this is very very far from a finished product. You should be able to see a new tab called "Verify" next to Article and Talk when looking at an article. Polygnotus (talk) 06:12, 26 November 2025 (UTC)

@GreenC Sorry, I forgot to add that you then have to click on one of those little ^[1] things. Polygnotus (talk) 11:55, 26 November 2025 (UTC)

Still nothing. It's possible other installed programs are interfering; I have a lot, including Factotum. -- GreenC 21:29, 26 November 2025 (UTC)

It started working, not sure what happened before (still no tab that says "Verify"). The window pops up, I entered the key, all is good. But there is an empty box in the Verification results, as if no verification was done. -- GreenC 01:49, 27 November 2025 (UTC)

@GreenC This edit improved support for the various skins. The idea is that you add the API key, click on a ^[1] and then press the verify claim button. After a short while the "Verification Result" box should contain the response from the LLM. I'll add some logging to the browser console soon. If I may ask, which browser/operating system are you using? Polygnotus (talk) 02:24, 27 November 2025 (UTC)

At Insecure direct object reference, I click "Verify claim" for footnote #1 and then nothing happens. Does it work for you? -- GreenC 02:28, 27 November 2025 (UTC)

Yep, works fine for me. The output is:

Based on the URL provided, the source is from PortSwigger, a well-known company in the field of web security, particularly recognized for their web security tool, Burp Suite. The URL indicates that the page is about "access control" and specifically "idor," which stands for Insecure Direct Object Reference.
From general knowledge and the context provided by the URL, Insecure Direct Object Reference (IDOR) is indeed a type of access control vulnerability. It occurs when an application provides direct access to objects based on user-supplied input, without proper authorization checks. This can allow attackers to access unauthorized data or perform actions they shouldn't be able to.
Given the reputation of PortSwigger as a reliable source for web security information, it is reasonable to conclude that the page likely discusses IDOR as an access control vulnerability.
Verdict: SUPPORTED
Explanation: The claim that IDOR is a type of access control vulnerability is consistent with general knowledge of web security and the context provided by the PortSwigger URL, which is a credible source in this field. However, full verification would require accessing the complete content of the source.

I used ChatGPT for this example. Do you see any errors in the browser console? Polygnotus (talk) 02:34, 27 November 2025 (UTC)

I'm using Gemini with a known working key. The Browser Console (Firefox) has numerous errors that I am not sure how to interpret, most of them related to "NS_ERROR_NOT_AVAILABLE", which is connected to "PushService". For example:

Given tab is not restoring. SessionStore.sys.mjs:7626:15
    _resetLocalTabRestoringState resource:///modules/sessionstore/SessionStore.sys.mjs:7626
    _restoreTabContentComplete resource:///modules/sessionstore/SessionStore.sys.mjs:8089
    _restoreTabContent resource:///modules/sessionstore/SessionStore.sys.mjs:7967

And..

PushService: receivedPushMessage: Error notifying app NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIPushNotifier.notifyPushWithData]
    _notifyApp resource://gre/modules/PushService.sys.mjs:1005
    _decryptAndNotifyApp resource://gre/modules/PushService.sys.mjs:871
PushService.sys.mjs:781:22

And..

Error adding url classifier exception list entry Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIUrlClassifierExceptionList.addEntry] NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIUrlClassifierExceptionList.addEntry]
    notifyObservers resource://gre/modules/UrlClassifierExceptionListService.sys.mjs:147
    addAndRunObserver resource://gre/modules/UrlClassifierExceptionListService.sys.mjs:31
    registerAndRunExceptionListObserver resource://gre/modules/UrlClassifierExceptionListService.sys.mjs:322
 
XPCWrappedNative_NoHelper { QueryInterface: QueryInterface(), init: init(), matches: matches(), category: Getter, urlPattern: Getter, topLevelUrlPattern: Getter, isPrivateBrowsingOnly: Getter, filterContentBlockingCategories: Getter, classifierFeatures: Getter, CATEGORY_INTERNAL_PREF: 0, … }
UrlClassifierExceptionListService.sys.mjs:149:17

-- GreenC 02:50, 27 November 2025 (UTC)

@GreenC Ah thank you that is just me being stupid, sorry. I re-used the code from my earlier projects, but didn't really do any testing. I can email you a Claude API key if you want so you can give that a try.

I have to rewrite the entire script at some point. Currently it just tells the LLMs to fetch the URL, which sometimes doesn't work. Ideally the script should get the URL, strip the irrelevant stuff and then feed the relevant stuff to the LLM. I have some irl obligations so it's probably easiest if I just send you the Claude API key. Polygnotus (talk) 03:33, 27 November 2025 (UTC)
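The "strip the irrelevant stuff and feed the rest to the LLM" step could be sketched like this. It is a rough illustration under stated assumptions: the page HTML has already been fetched somehow (fetching is left out here), the regular-expression clean-up stands in for a proper DOM parser, and `extractText`, `buildPrompt`, the prompt wording and the `maxChars` limit are all hypothetical.

```javascript
// Rough text extraction with regular expressions. A real implementation
// would more likely use a DOM parser or a readability library.
function extractText(html) {
  return html
    // Drop whole elements that rarely contain article text.
    .replace(/<(script|style|nav|header|footer)\b[\s\S]*?<\/\1>/gi, " ")
    // Drop the remaining tags but keep their text content.
    .replace(/<[^>]+>/g, " ")
    // Decode the two most common entities only.
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    // Collapse whitespace.
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical prompt wording; the verdict labels are the ones the tool uses.
function buildPrompt(claim, sourceText, maxChars = 20000) {
  return [
    "Decide whether the source text supports the claim.",
    "Answer with a line 'Verdict: SUPPORTED', 'Verdict: PARTIALLY SUPPORTED',",
    "'Verdict: NOT SUPPORTED' or 'Verdict: UNCLEAR', then a short explanation.",
    "",
    "Claim: " + claim,
    "",
    "Source text:",
    sourceText.slice(0, maxChars), // keep the request within a size budget
  ].join("\n");
}
```

Sending the extracted text rather than a bare URL means the model never has to fetch anything itself, so its answer rests on the page content rather than on what the URL suggests.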

Thanks for the offer. I don't want to compromise your key; let me check if I can get one from work first. I'm definitely interested to see how well AI does verifying citations. -- GreenC 04:00, 27 November 2025 (UTC)

@GreenC Ah, no worries. I've emailed you the first part of a new unused key and I will delete it after a while. The remaining balance is ~$6 and auto reload is disabled.

Add "R9TCQwAA" to the key found in your email and you got a complete Claude API key.

I may have to make some API for MiniCheck soon but my to-do list is pretty long. Polygnotus (talk) 04:10, 27 November 2025 (UTC)

Great, worked right away. MiniCheck would be great. I'll run some experiments with Claude for now and see how it does. -- GreenC 04:18, 27 November 2025 (UTC)

The tables are very helpful. Does the tool create tables automatically? That would be the bomb, hit a button and a table is created on the talk page (or somewhere), including the explanation. -- GreenC 15:27, 29 November 2025 (UTC)

@GreenC The code on Wiki does not yet do that, but the latest version does. Note that formatting stuff as a wikitable is just one Claude API call away. ;-) Polygnotus (talk) 22:21, 29 November 2025 (UTC)
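For comparison, a results table can also be assembled without an API call. The sketch below is not how the latest version of the tool does it (that code is not shown on this page); the result shape `{ ref, verdict, explanation }` and the function name `toWikitable` are assumptions for illustration.

```javascript
// Hypothetical result shape: { ref, verdict, explanation }.
// Builds a wikitable with one row per checked citation.
function toWikitable(results) {
  // A literal "|" would be read as table syntax, so encode it, and keep
  // each cell on one line.
  const clean = (s) => String(s).replace(/\|/g, "&#124;").replace(/\n+/g, " ");
  const rows = results.map(
    (r) => "|-\n| " + [r.ref, r.verdict, r.explanation].map(clean).join(" || ")
  );
  return [
    '{| class="wikitable"',
    "! Reference !! Verdict !! Explanation",
    ...rows,
    "|}",
  ].join("\n");
}

console.log(
  toWikitable([
    { ref: "[1]", verdict: "SUPPORTED", explanation: "The source states this directly." },
    { ref: "[2]", verdict: "UNCLEAR", explanation: "Failed to fetch source text." },
  ])
);
```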

Verifying offline sources

I've started a thread about it at Wikipedia_talk:WikiProject_AI_Tools#Checking_offline_sources, would be happy to hear your thoughts. Alaexis_¿question? 11:22, 29 November 2025 (UTC)

I'll respond over there, thanks, Polygnotus (talk) 11:23, 29 November 2025 (UTC)

@Polygnotus, thanks for adding this functionality.

Can you measure the adoption of your tool? Curious about the feedback you're getting. I suspect that getting an API key could be a major friction point. I wonder if there is a way to get a budget of (open-source?) credits for autoconfirmed editors somewhere. Alaexis_¿question? 17:15, 9 December 2025 (UTC)

@Alaexis Hi! I have intentionally not advertised these tools yet, for example the AI Source Verification tool is not listed on my userpage and not in the Wikipedia:User_scripts/List.

On the one hand that is because the community hasn't really decided how to deal with stuff like this yet, on the other hand I am kinda lazy and on the third I think that the User:Polygnotus/CitationVerification-method would be even better. It could for example be a service where people fill in a form online to have an article checked, and then a post on the talkpage of that article follows, something like that.

Despite that, a few people have added it to their common.js, probably because I mentioned it on a usertalkpage.

Usually the adoption follows the Wikipedia:Scripts++ newsletter. Pretty sure I can't hide it much longer, see Wikipedia:Scripts++/Next.

Indeed, most sane people don't have Claude API keys lying around. Gemini is free tho.

The WMF has an insane amount of money, but it is difficult to convince them to spend it on something other than gold-plated Labubus.

This is why I made the User:Polygnotus/Scripts/Backlog.js thing with pregenerated suggestions from the AI proofreader. Polygnotus (talk) 17:32, 9 December 2025 (UTC)

I think you might want to advertise it at other venues as well (such as Wikipedia_talk:WikiProject_Reliability, RS noticeboard, etc.). I only learned about the existence of the Scripts++ newsletter from you today :) Alaexis_¿question? 19:24, 9 December 2025 (UTC)

@Alaexis Problem is that Wikipedia sucks for script development.

There are various proposed solutions, each with their unique up- and downsides, like using GitHub and having a bot to keep the script onwiki in sync.

I believe that in an ideal world I could just throw a public domain license on my script, have some public repo where people can work on it and file issues/bug reports.

Getting many users is not really a problem; having a large install base can be.

Special:WhatLinksHere/User:Polygnotus/DuplicateReferences.js has the most installations of the stuff in my userspace I believe. Polygnotus (talk) 19:28, 9 December 2025 (UTC)

It's also not ideal for chatting. Can I find you on some other platform? I'd like to run some ideas by you. Alaexis_¿question? 20:44, 9 December 2025 (UTC)

@Alaexis Yeah, it's not ideal for a lot of things. I have a Discord. Polygnotus (talk) 21:09, 9 December 2025 (UTC)

Pinged you there. Alaexis_¿question? 21:43, 9 December 2025 (UTC)

Ping Alaexis_¿question? 21:42, 10 December 2025 (UTC)

@Polygnotus just learned about this open-source free model. Haven't tested it yet but assuming it works it sounds like a perfect match for your apps. Alaexis_¿question? 20:32, 11 December 2025 (UTC)

Feedback

@Polygnotus, I use your tool regularly and for the most part it works really well. I was particularly impressed by the PDF support. This is my feedback, in the descending order of importance.

Sometimes the boundaries of the claim are determined incorrectly. The tool thinks that the last instance of this reference supports the whole paragraph starting from "In the afternoon of 9 August..." while in fact it supports only the last sentence of it ("The number of Russian forces deployed").
Verification result - now it contains the *previous* result until the verification is done. This is confusing. I'd suggest clearing the previous results when the user clicks on a citation.
When there is a regular link and an archived one, the regular one is often stale. I'd suggest using the archived one by default. Example [1]
When it hits a paywall it claims that "Unfortunately, I'm unable to directly fetch content from URLs in real-time" which is false and confusing to the user. Example [2]
When it's unable to fetch source text, the response is way too wordy, see the example below. It would be enough to say "Result: Unclear. Failed to fetch source text"

Extended content

“

I'll fetch the content from the provided URL and verify the claim against it.

After attempting to access the URL http://en.rian.ru/analysis/20080813/116006455.html, I found that the content is no longer available. The URL appears to be from RIA Novosti's English service, but the specific article cannot be retrieved, likely due to the age of the content (from 2008) and changes to the website structure over time.

Verdict: UNCLEAR

Explanation: Without access to the source content, I cannot verify whether the claim is supported by the article. The claim makes specific assertions about:

Georgian President Saakashvili making a peace settlement proposal for South Ossetia
The proposal being made at the Parliamentary Assembly of the Council of Europe in Strasbourg
The timing being January 2005
The proposal being for South Ossetia within a unified Georgian state
South Ossetian leader Eduard Kokoity rejecting the proposal

Since the source URL is inaccessible, I cannot confirm or refute any of these specific details. To properly fact-check this claim, an alternative accessible source would be needed that covers the same events and timeframe.

Recommendation: To verify this claim, you would need to:

Find an alternative source that covers Georgian-South Ossetian relations in January 2005
Look for records from the Parliamentary Assembly of the Council of Europe from that period
Consult other news archives or academic sources about Saakashvili's peace initiatives during that timeframe

”

Alaexis_¿question? 21:52, 11 December 2025 (UTC)
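Two of the suggestions above, preferring the archived link and keeping the failure message short, could be sketched as below. The citation shape `{ url, archiveUrl }`, the `fetched` flag and both function names are hypothetical and are not taken from the script.

```javascript
// Hypothetical citation shape: { url, archiveUrl }.
// Prefers the archived copy and falls back to the live link.
function pickSourceUrl(citation) {
  return citation.archiveUrl || citation.url || null;
}

// Produces a one-line result. When no source text could be fetched,
// the model's long explanation is replaced by a fixed short message.
function formatResult(verdict, explanation, fetched) {
  if (!fetched) return "Result: Unclear. Failed to fetch source text";
  // "NOT SUPPORTED" -> "Not supported"
  const label = verdict.charAt(0) + verdict.slice(1).toLowerCase();
  return "Result: " + label + ". " + explanation;
}

console.log(pickSourceUrl({ url: "http://example.org/a" }));
console.log(formatResult("UNCLEAR", "", false));
```

Deciding in code whether the fetch succeeded also avoids the misleading "unable to fetch content in real-time" wording, since the model is never asked to describe the failure.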