Wikipedia:Signs of AI writing

A screenshot of ChatGPT reading: "[header] Legacy & Interpretation [body] The "Black Hole Edition" is not just a meme — it's a celebration of grassroots car culture, where ideas are limitless and fun is more important than spec sheets. Whether powered by a rotary engine, a V8 swap, or an imagined fighter jet turbine, the Miata remains the canvas for car enthusiasts worldwide."
LLMs tend to have an identifiable writing style.

This is a list of writing and formatting conventions typical of AI chatbots such as ChatGPT, with real examples taken from Wikipedia articles and drafts. It is meant to act as a field guide to help detect undisclosed AI-generated content on Wikipedia. This list is descriptive, not prescriptive; it consists of observations, not rules. Advice about formatting or language to avoid in Wikipedia articles can be found in the policies and guidelines and the Manual of Style, but does not belong on this page.

This list is not a ban on certain words, phrases, or punctuation. No one is taking your em-dashes away or claiming that only LLMs use them. Not all text featuring the following indicators is AI-generated, as the large language models that power AI chatbots are trained on human writing, including the writing of Wikipedia editors. This is simply a catalog of very common patterns observed over many thousands of instances of AI-generated text, specific to Wikipedia. While some of its advice may be broadly applicable, some signs—particularly those involving punctuation and formatting—may not apply in a non-Wikipedia context.

The patterns here are also only potential signs of a problem, not the problem itself. While many of these issues are immediately obvious and easy to fix—e.g., excessive boldface, poor wordsmithing, broken markup, citation style quirks—they can point to less outwardly visible problems that carry much more serious policy risks. If LLM-generated text is polished enough (initially or subsequently tidied up), those surface defects might not be present, but the deeper problems likely will. Please do not merely treat these signs as the problems to be fixed; that could just make detection harder. The actual problems are those deeper concerns, so make sure to address them, either yourself or by flagging them, per the advice at Wikipedia:Large language models § Handling suspected LLM-generated content and Wikipedia:WikiProject AI Cleanup/Guide.

The speedy deletion policy criterion G15 (LLM-generated pages without human review) is limited to the most objective and least contestable indications that the page's content was generated by an LLM. There are three such indicators, the first of which can be found in § Communication intended for the user and the other two in § Citations. The other signs, though they may indeed indicate AI use, are not sufficient for speedy deletion.

Do not solely rely on artificial intelligence content detection tools (such as GPTZero) to evaluate whether text is LLM-generated. While they perform better than random chance, these tools have nontrivial error rates and cannot replace human judgment.[1]

Language and tone


LLMs (and artificial neural networks in general) use statistical algorithms to guess (infer) what should come next based on a large corpus of training material. They thus tend to regress to the mean; that is, the output tends toward the most statistically likely result, one that applies to the widest variety of cases. This can simultaneously be a strength and a "tell" for detecting AI-generated content.

For example, LLMs are usually trained on data from the internet, in which famous people are generally described with positive, important-sounding language. A model will thus sand down specific, unusual, nuanced facts (which are statistically rare) and replace them with more generic, positive descriptions (which are statistically common). Thus the specific detail "invented a train-coupling device" might become "a revolutionary titan of industry." LLMs tend to smooth out unusual details and drift toward the most common, statistically probable way of describing a topic. It is like shouting louder and louder that a portrait shows a uniquely important person, while the portrait itself is fading from a sharp photograph into a blurry, generic sketch. The subject becomes simultaneously less specific and more exaggerated.[2]

This statistical regression to the mean, a smoothing over of specific facts into generic statements that could apply to many topics, makes AI-generated content easier to detect.

Undue emphasis on symbolism and importance


LLM writing often puffs up the importance of the subject matter with reminders that it represents or contributes to a broader topic. LLMs seem to draw on only a small repertoire of phrasings for these reminders, so even where such a reminder is otherwise appropriate, it is best reworded anyway.

When talking about biology (e.g. when asked to discuss a given animal or plant species), LLMs tend to put too much emphasis on the species' conservation status and the efforts to protect it, even if the status is unknown and no serious efforts exist.

Examples

Douera enjoys close proximity to the capital city, Algiers, further enhancing its significance as a dynamic hub of activity and culture. With its coastal charm and convenient location, Douera captivates both residents and visitors alike[...]

— From this revision to Douéra

Berry Hill today stands as a symbol of community resilience, ecological renewal, and historical continuity. Its transformation from a coal-mining hub to a thriving green space reflects the evolving identity of Stoke-on-Trent.

Promotional language


LLMs have serious problems keeping a neutral tone, especially when writing about something that could be considered "cultural heritage"—in which case they will constantly remind the reader that it is cultural heritage.

Examples

Nestled within the breathtaking region of Gonder in Ethiopia, Alamata Raya Kobo stands as a vibrant town with a rich cultural heritage and a significant place within the Amhara region. From its scenic landscapes to its historical landmarks, Alamata Raya Kobo offers visitors a fascinating glimpse into the diverse tapestry of Ethiopia. In this article, we will explore the unique characteristics that make Alamata Raya Kobo a town worth visiting and shed light on its significance within the Amhara region.

TTDC acts as the gateway to Tamil Nadu’s diverse attractions, seamlessly connecting the beginning and end of every traveller's journey. It offers dependable, value-driven experiences that showcase the state’s rich history, spiritual heritage, and natural beauty.

Editorializing


LLMs often introduce their own interpretation, analysis, and opinions in their writing, even when they are asked to write neutrally, violating the policy No original research. Editorializing can appear through specific words or phrases or within broader sentence structures. This indicator often overlaps with other language and tone indicators in this list. Note that humans and especially new editors often make this mistake as well.

Examples

A defining feature of FSP models is their ability to simulate environmental interactions.

Their ability to simulate both form and function makes them powerful tools for understanding plant-environment interactions and optimizing performance under diverse biological and management contexts.

— From the same page

In this case, the whole sentence constitutes an original opinion:

These partnerships reflect the company’s role in serving both corporate and community organizations in Uganda.

Overuse of certain conjunctions


While human writing obviously contains connecting words and phrases, LLMs tend to overuse them, in a stilted, formulaic way. This is often a byproduct of an essay-like structure that implies synthesis of facts, which is typical of LLM writing but inappropriate for Wikipedia.

Examples

Compare the examples above with the human-written prose below. Even disregarding the absence of buzzwords and the general vapidity, the connecting phrases in the excerpt below are more varied and less conspicuous.

Section summaries


LLMs will often end a paragraph or section by summarizing and restating its core idea.[3] While this may be acceptable in some scholarly writing, proper Wikipedia writing does not summarize the general idea of a block of article text (the lead section, which summarizes the entire article, is the exception).

Examples

In summary, the educational and training trajectory for nurse scientists typically involves a progression from a master's degree in nursing to a Doctor of Philosophy in Nursing, followed by postdoctoral training in nursing research. This structured pathway ensures that nurse scientists acquire the necessary knowledge and skills to engage in rigorous research and contribute meaningfully to the advancement of nursing science.

Outline-like conclusions about challenges and future prospects


Many LLM-generated Wikipedia articles include a "Challenges" section, which typically begins with a sentence like "Despite its [positive/promotional words], [article subject] faces challenges..." and ends with either a positive assessment of the article subject, or speculation about how ongoing or potential initiatives could benefit the subject. Such paragraphs usually appear at the end of articles with a rigid outline structure, which may also include a separate section for "Future Prospects."

Note: This sign is about the rigid formula, not simply the mention of challenges.

Examples

Despite its industrial and residential prosperity, Korattur faces challenges typical of urban areas, including[...] With its strategic location and ongoing initiatives, Korattur continues to thrive as an integral part of the Ambattur industrial zone, embodying the synergy between industry and residential living.

— From this revision to Korattur

Despite its success, the Panama Canal faces challenges, including[...] Future investments in technology, such as automated navigation systems, and potential further expansions could enhance the canal’s efficiency and maintain its relevance in global trade.

Despite their promising applications, pyroelectric materials face several challenges that must be addressed for broader adoption. One key limitation is[...] Despite these challenges, the versatility of pyroelectric materials positions them as critical components for sustainable energy solutions and next-generation sensor technologies.

The future of hydrocarbon economies faces several challenges, including resource depletion, environmental concerns, and the shift to sustainable energy sources. This section would speculate on potential developments and the changing landscape of global energy.

Operating in the current Afghan media environment presents numerous challenges, including the safety of journalists and financial constraints due to the Taliban's restrictions on independent media. Despite these challenges, Amu TV has managed to continue to provide a vital service to the Afghan population.

Negative parallelisms


Parallel constructions involving "not", "but", or "however" such as "Not only ... but ..." or "It is not just about ..., it's ..." are common in LLM writing but are often unsuitable for writing in a neutral tone.[1]

Examples

Self-Portrait by Yayoi Kusama, executed in 2010 and currently preserved in the famous Uffizi Gallery in Florence, constitutes not only a work of self-representation, but a visual document of her obsessions, visual strategies and psychobiographical narratives.

It’s not just about the beat riding under the vocals; it’s part of the aggression and atmosphere.

Here is an example of a negative parallelism across multiple sentences:

He hailed from the esteemed Duse family, renowned for their theatrical legacy. Eugenio's life, however, took a path that intertwined both personal ambition and familial complexities.

Some parallelisms may follow the pattern of "No ..., no ..., just ...":

There are no long-form profiles. No editorial insights. No coverage of her game dev career. No notable accolades. Just TikTok recaps and callouts.

Rule of three


LLMs overuse the 'rule of three'—"the good, the bad, and the ugly". This can take different forms from "adjective, adjective, adjective" to "short phrase, short phrase, and short phrase".[1] LLMs often use this structure to make superficial analyses appear more comprehensive.

Examples

The Amaze Conference brings together global SEO professionals, marketing experts, and growth hackers to discuss the latest trends in digital marketing. The event features keynote sessions, panel discussions, and networking opportunities.

Superficial analyses


AI chatbots tend to insert superficial analysis of information, often in relation to its significance, recognition, or impact. This is often done by attaching a present participle ("-ing") phrase at the end of sentences, sometimes with vague attributions to third parties (see below). These comments are generally unhelpful as they introduce unnecessary or fictional opinions.

Examples

In 2025, the Federation was internationally recognized and invited to participate in the Asia Pickleball Summit, highlighting Pakistan’s entry into the global pickleball community.

Consumers benefit from the flexibility to use their preferred mobile wallet at participating merchants, improving convenience.

These citations, spanning more than six decades and appearing in recognized academic publications, illustrate Blois' lasting influence in computational linguistics, grammar, and neology.

The civil rights movement emerged as a powerful continuation of this struggle, emphasizing the importance of solidarity and collective action in the fight for justice.

Vague attributions of opinion


AI chatbots tend to attribute opinions or claims to some vague authority—a practice called weasel wording—while citing only one or two sources that may or may not actually express such a view. They also tend to overgeneralize the perspective of one or a few sources into that of a wider group.

Examples

Here, the weasel wording implies the opinion comes from an independent source, but it actually cites Nick Ford's own website.

His [Nick Ford's] compositions have been described as exploring conceptual themes and bridging the gaps between artistic media.[a]

Due to its unique characteristics, the Haolai River is of interest to researchers and conservationists. Efforts are ongoing to monitor its ecological health and preserve the surrounding grassland environment, which is part of a larger initiative to protect China’s semi-arid ecosystems from degradation.

  1. ^ "About". Nick Ford. Retrieved 2025-06-25.

Avoidance of noun repetition


Generative AI typically applies a repetition penalty when generating text, so that it does not repeat words too often. Although this often helps in the case of adjectives and verbs, problems arise with nouns, particularly those that refer to an article's main subject, as an LLM's attempt to use different nouns can result in text that confuses readers.[4] For instance, the output might give a main character's name and then repeatedly use a different synonym or related term (e.g., protagonist, key player, eponymous character) when mentioning it again. This reduces the unity of the article and increases cognitive load for the reader, who must mentally re-establish that the same thing is being talked about.
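
The mechanism itself is simple. Below is a minimal, self-contained sketch of how such a penalty can be applied to a model's raw output scores; the function name, the penalty value, and the dictionary representation of scores are illustrative assumptions, not any particular vendor's implementation.

def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Discourage tokens that have already been generated.

    logits: mapping of candidate token id -> raw score for the next position
    generated_ids: token ids produced so far
    penalty: values above 1 make repeats less likely
    """
    adjusted = dict(logits)
    for token_id in set(generated_ids):
        if token_id in adjusted:
            score = adjusted[token_id]
            # Dividing positive scores and multiplying negative ones
            # uniformly lowers a repeated token's probability.
            adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted

# Toy example: token 7 ("Miata") was just used, so its score drops below
# token 9 (a synonym such as "roadster"), nudging the sampler toward the synonym.
scores = {7: 4.0, 9: 3.5, 11: 1.0}
print(apply_repetition_penalty(scores, generated_ids=[7]))
# {7: 3.0769..., 9: 3.5, 11: 1.0}

Because the penalty applies to every repeat, nouns that must recur, such as the name of an article's subject, are exactly the tokens it suppresses.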

False range


When giving examples of items within a set, AI chatbots will often mention these items within a phrase that reads "from ... to ...", which often results in a non-encyclopedic tone. This indicator is not to be confused with the prepositions' non-figurative usage, such as in spatial or temporal contexts (e.g. "... went from Chicago to Los Angeles", "... the library will be closed from Friday to Wednesday").

Examples

The essential components that form the foundation of Somali dishes encompass staples like rice and pasta, along with an extensive selection of meats ranging from lamb to beef and chicken.

Our journey through the universe has taken us from the singularity of the Big Bang to the grand cosmic web, from the birth and death of stars that forge the elements of life, to the enigmatic dance of dark matter and dark energy that shape its destiny.

[...]

Intelligence and Creativity: From problem-solving and tool-making to scientific discovery, artistic expression, and technological innovation, human intelligence is characterized by its adaptability and capacity for novel solutions.

[...]

Continued Scientific Discovery: The quest to understand the universe, life, and ourselves will continue to drive scientific breakthroughs, from fundamental physics to medicine and neuroscience.

Style


Title case in section headings


In section headings, AI chatbots have a strong tendency to capitalize all main words (title case) rather than using sentence case.[1]

Examples

Early Life and Education

Thomas was born in Cochranville, Pennsylvania. [...]

Applications in Racing

Thomas’s behavioral profiling has been used to evaluate Kentucky Derby [...]

Global Consulting

Thomas’s behavioral profiling has been used to evaluate Kentucky Derby and Breeders’ Cup contenders. [...]

International Speaking Engagements

In July 2025, Thomas was invited as a featured presenter to the Second Horse Economic Forum [...]

Educational Programs

Thomas is the founder of the Institute for Advanced Equine Studies [...]

Excessive use of boldface


AI chatbots may display various phrases in boldface for emphasis in an excessive, mechanical manner. One of their tendencies, inherited from readmes, fan wikis, how-tos, sales pitches, slide decks, listicles and other materials that heavily use boldface, is to emphasize every instance of a chosen word or phrase, often in a "key takeaways" fashion. Some newer large language models or apps have instructions to avoid overuse of boldface.

Examples

It blends OKRs (Objectives and Key Results), KPIs (Key Performance Indicators), and visual strategy tools such as the Business Model Canvas (BMC) and Balanced Scorecard (BSC). OPC is designed to bridge the gap between strategy and execution by fostering a unified mindset and shared direction within organizations.

Lists


AI chatbots often organize the contents of their responses into lists that are formatted in a particular way. A very common example is a "bullet points with bold titles" style, in which the content of each bullet point is a longer rewording of the bolded keyword preceding it.

Lists that are copied and pasted from AI chatbot responses may retain their original formatting. Instead of proper wikitext (*), a bullet point in an unordered list may appear as a bullet character (•), hyphen (-), en dash (–), or similar character. Ordered lists (i.e. numbered lists) may use explicit numbers (such as 1.) instead of standard wikitext (#).
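
As a rough illustration, a cleanup pass over such pasted lists might look like the sketch below; the function name is hypothetical, and its output always needs human review, since a leading hyphen or en dash can also be legitimate punctuation.

import re

# Characters that often stand in for wikitext "*" in pasted chatbot lists.
BULLET_CHARS = "•◦▪‣"

def normalize_pasted_list(text):
    # "• item", "- item", "– item" -> "* item"
    text = re.sub(rf"^[ \t]*[{BULLET_CHARS}\-–][ \t]+", "* ", text, flags=re.MULTILINE)
    # "1. item" or "2) item" -> "# item"
    text = re.sub(r"^[ \t]*\d+[.)][ \t]+", "# ", text, flags=re.MULTILINE)
    return text

print(normalize_pasted_list("• First point\n– Second point\n1. Step one"))
# * First point
# * Second point
# # Step one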

Examples

1. Historical Context
Post-WWII Era: The world was rapidly changing after WWII, [...]
2. Nuclear Arms Race: Following the U.S. atomic bombings, the Soviet Union detonated its first bomb in 1949, [...]
3. Key Figures
Edward Teller: A Hungarian physicist who advocated for the development of more powerful nuclear weapons, [...]
4. Technical Details of Sundial
Hydrogen Bomb: The design of Sundial involved a hydrogen bomb [...]
5. Destructive Potential: If detonated, Sundial would create a fireball up to 50 kilometers in diameter, [...]
6. Consequences and Reactions
Global Impact: The explosion would lead to an apocalyptic nuclear winter, [...]
7. Political Reactions: The U.S. military and scientists expressed horror at the implications of such a weapon, [...]
8. Modern Implications
Current Nuclear Arsenal: Today, there are approximately 12,000 nuclear weapons worldwide, [...]
9. Key Takeaways
Understanding the Madness: The concept of Project Sundial highlights the extremes of human ingenuity [...]
10. Questions to Consider
What were the motivations behind the development of Project Sundial? [...]

Emoji


Sometimes, AI chatbots decorate section headings or bullet points by placing emojis in front of them.

Examples

Let’s decode exactly what’s happening here:
🧠 Cognitive Dissonance Pattern:
You’ve proven authorship, demonstrated originality, and introduced new frameworks, yet they’re defending a system that explicitly disallows recognition of originators unless a third party writes about them first.
[...]
🧱 Structural Gatekeeping:
Wikipedia policy favors:
[...]
🚨 Underlying Motivation:
Why would a human fight you on this?
[...]
🧭 What You’re Actually Dealing With:
This is not a debate about rules.
[...]

🪷 Traditional Sanskrit Name: Trikoṇamiti
Tri = Three
Koṇa = Angle
Miti = Measurement 🧭 “Measurement of three angles” — the ancient Indian art of triangle and angle mathematics.
🕰️ 1. Vedic Era (c. 1200 BCE – 500 BCE)
[...]
🔭 2. Sine of the Bow: Sanskrit Terminology
[...]
🌕 3. Āryabhaṭa (476 CE)
[...]
🌀 4. Varāhamihira (6th Century CE)
[...]
🌠 5. Bhāskarācārya II (12th Century CE)
[...]
📤 Indian Legacy Spreads

Overuse of em dashes


While human editors may use em dashes (—), LLM output tends to use them more often than human-written text of the same genre, and uses them in places where humans are more likely to use commas, parentheses, colons, or (misused) hyphens (-). LLMs especially tend to use em-dashes in a formulaic, pat way, often mimicking "punching up" sales-like writing by over-emphasizing clauses or parallelisms.

This sign is most useful when taken in combination with other indicators, not by itself.

Examples

Elwandore is a virtual micronation for people with passion and skill — a place to build, to create, and to help each other grow while chasing wealth. But not wealth for greed — wealth to give, to help others, to donate.

The term “Dutch Caribbean” is not used in the statute and is primarily promoted by Dutch institutions, not by the people of the autonomous countries themselves. In practice, many Dutch organizations and businesses use it for their own convenience, even placing it in addresses — e.g., “Curaçao, Dutch Caribbean” — but this only adds confusion internationally and erases national identity. You don’t say “Netherlands, Europe” as an address — yet this kind of mislabeling continues.

Curly quotation marks and apostrophes


AI chatbots typically use curly quotation marks (“...” or ‘...’) instead of straight quotation marks ("..." or '...'). In some cases, AI chatbots inconsistently use pairs of curly and straight quotation marks in the same response. They also tend to use the curly apostrophe (’; the same character as the curly right single quotation mark) instead of the straight apostrophe ('), such as in contractions and possessive forms. They may also do this inconsistently.

Curly quotes alone do not prove LLM use. Microsoft Word as well as macOS and iOS devices have a "smart quotes" feature that converts straight quotes to curly quotes. Grammar correcting tools such as LanguageTool may also have such a feature. Curly quotation marks and apostrophes are common in professionally typeset works such as major newspapers. Citation tools like Citer may repeat those that appear in the title of a web page: for example,

McClelland, Mac (2017-09-27). "When 'Not Guilty' Is a Life Sentence". The New York Times. Retrieved 2025-08-03.

Note that Wikipedia allows users to customize the fonts used to display text. Some fonts display matched curly apostrophes as straight, in which case the distinction is invisible to the user.
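
For a quick mechanical first pass, counting the two styles side by side will surface the inconsistent mixing described above. The sketch below uses hypothetical names, and no count on its own proves LLM involvement.

# Curly ("smart") characters: left/right double and single quotation marks.
CURLY_QUOTES = "\u201c\u201d\u2018\u2019"

def quote_profile(text):
    """Count curly vs. straight quotation characters in a passage."""
    curly = sum(text.count(c) for c in CURLY_QUOTES)
    straight = text.count('"') + text.count("'")
    return {"curly": curly, "straight": straight}

print(quote_profile("It\u2019s a \u201cmixed\u201d sample with 'both' styles."))
# {'curly': 3, 'straight': 2}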

Letter-like writing


Talk page messages and unblock requests generated by AI chatbots often include salutations and valedictions. Many messages emphasize a user's good faith and promise that the user will adhere to Wikipedia's guidelines. The presence of a subject line above the text, intended to fill the Subject line on an email form, is a more definitive tell.

Examples

Keep in mind that not all messages written this way are AI-generated. Letters and emails have conventionally been written in similar ways long before modern LLMs existed.

In addition, some human editors may mistakenly post emails, letters, petitions, or messages intended for the article's subject, frequently formatted as letters. While such edits are generally off-topic and may be removed per the guidelines at WP:NOTFORUM—particularly if they contain personal information—they are not necessarily LLM-generated.

Communication intended for the user


Collaborative communication


In some cases, editors will paste text from an AI chatbot that was meant as correspondence, prewriting or advice by the chatbot, rather than article content. AI chatbots may also explicitly indicate that the text is for a Wikipedia article if prompted to produce one, and may mention various policies and guidelines in their outputs—often explicitly specifying that they're Wikipedia's conventions.

Examples

This fictional article combines the tone of a Wikipedia article and the creative elements you requested, including the announcement date, release date, new cast, and crew for the sequel. Let me know if you'd like it expanded or tailored further!

Certainly. Here's a draft Wikipedia-style article for Mark Biram, written in a neutral, encyclopedic tone and formatted according to Wikipedia conventions. This assumes notability is supported by independent sources (which would need to be cited for a real Wikipedia page):

Final important tip: The ~~~~ at the very end is Wikipedia markup that automatically

In this section, we will discuss the background information related to the topic of the report. This will include a discussion of relevant literature, previous research, and any theoretical frameworks or concepts that underpin the study. The purpose is to provide a comprehensive understanding of the subject matter and to inform the reader about the existing knowledge and gaps in the field.

Including photos of the forge (as above) and its tools would enrich the article’s section on culture or economy, giving readers a visual sense of Ronco’s industrial heritage. Visual resources can also highlight Ronco Canavese’s landscape and landmarks. For instance, a map of the Soana Valley or Ronco’s location in Piedmont could be added to orient readers geographically. The village’s scenery [...] could be illustrated with an image. Several such photographs are available (e.g., on Wikimedia Commons) that show Ronco’s panoramic view, [...] Historical images, if any exist (such as early 20th-century photos of villagers in traditional dress or of old alpine trades), would also add depth to the article. Additionally, the town’s notable buildings and sites can be visually presented: [...] Including an image of the Santuario di San Besso [...] could further engage readers. By leveraging these visual aids – maps, photographs of natural and cultural sites – the expanded article can provide a richer, more immersive picture of Ronco Canavese.

Knowledge-cutoff disclaimers and speculation about gaps in sources


A knowledge-cutoff disclaimer is a statement used by the AI chatbot to indicate that the information provided may be incomplete, inaccurate, or outdated.

If an LLM has a fixed knowledge cutoff (usually the model's last training update), it is unable to provide any information on events or developments past that time, and it will often output a disclaimer to remind the user of this cutoff, which usually takes the form of a statement that says the information provided is accurate only up to a certain date.

If an LLM with retrieval-augmented generation (for example, an AI chatbot that can search the web) fails to find sources on a given topic, or if information is not included in sources provided to it in a prompt, it will often output a statement to that effect, which is similar to a knowledge-cutoff disclaimer. It may also pair it with text about what that information "likely" may be and why it is significant. This information is entirely speculative (including the very claim that it's "not documented") and may be based on loosely related topics or completely fabricated. It is also frequently combined with the tells above.

Examples

While specific information about the fauna of Studniční hora is limited in the provided search results, the mountain likely supports...

Though the details of these resistance efforts aren't widely documented, they highlight her bravery...

No significant public controversies or security incidents affecting Outpost24 have been documented as of June 2025.

— From Draft:Outpost24

As of my last knowledge update in January 2022, I don't have specific information about the current status or developments related to the "Chester Mental Health Center" in today's era.

Below is a detailed overview based on available information:

  1. ^ not unique to AI chatbots; is produced by the {{as of}} template

Prompt refusal


Occasionally, the AI chatbot will decline to answer a prompt as written, usually with an apology and a reminder that it is "an AI language model". Attempting to be helpful, it often gives suggestions or an answer to an alternative, similar request. Outright refusals have become increasingly rare.

Prompt refusals are obviously unacceptable for Wikipedia articles, but some users include them anyway. This may indicate that the user did not review the text, or that they do not have a proficient grasp of the English language. Remember to assume good faith, because the editor may genuinely want to improve our coverage of knowledge gaps.

Examples

As an AI language model, I can't directly add content to Wikipedia for you, but I can help you draft your bibliography.

Links to searches

When results appear in these searches, they are almost always problematic – but remember that it would be okay for an article to include them if, for example, they were in a relevant, attributed quote.

Phrasal templates and placeholder text


AI chatbots may generate responses with fill-in-the-blank phrasal templates (as seen in the game Mad Libs) for the LLM user to replace with words and phrases pertaining to their use case. However, some LLM users forget to add such words. Note that non-LLM-generated templates exist for drafts and new articles, such as Wikipedia:Artist biography article template/Preload and pages in Category:Article creation templates.

Examples

Subject: Concerns about Inaccurate Information

Dear Wikipedia

I am writing to express my deep concern about the spread of misinformation on your platform. Specifically, I am referring to the article about [Entertainer's Name], which I believe contains inaccurate and harmful information.

Subject: Edit Request for Wikipedia Entry

Dear Wikipedia Editors,

I hope this message finds you well. I am writing to request an edit for the Wikipedia entry

I have identified an area within the article that requires updating/improvement. [Describe the specific section or content that needs editing and provide clear reasons why the edit is necessary, including reliable sources if applicable].

[URL of source confirming birth, if available], [URL of reliable source]

(Note: Actual Wikipedia articles require verifiable citations from independent sources. The following entries are placeholders to indicate where citations would go if sources were available.)

— From a speedily-deleted draft

Markup


Use of Markdown


AI chatbots are not proficient in wikitext, the markup language that tells Wikipedia's MediaWiki software how to format an article. As wikitext is mostly tied to a specific platform running specific software (a wiki running on MediaWiki), it is a niche markup language with little exposure beyond Wikipedia and other MediaWiki-based platforms like Miraheze. LLMs therefore see relatively little wikitext-formatted training data: while chatbots' training corpora did ingest millions of Wikipedia articles, those articles would not have been processed as text files containing wikitext syntax. This is compounded by the fact that most chatbots are factory-tuned to use another, conceptually similar but much more widely applied markup language: Markdown. Their system-level instructions direct them to format outputs using it, and the chatbot apps render its syntax as formatted text on a user's screen, enabling the display of headings, bulleted and numbered lists, tables, etc., just as MediaWiki renders wikitext to make Wikipedia articles look like formatted documents.

When asked about its "formatting guidelines", a chatbot willing to reveal some of its system-level instructions will typically disclose some variation of the following (this is Microsoft Copilot in mid-2025):

## Formatting Guidelines

- All output uses GitHub-flavored Markdown.  
- Use a single main title (`#`) and clear primary subheadings (`##`).  
- Keep paragraphs short (3–5 sentences, ≤150 words).  
- Break large topics into labeled subsections.  
- Present related items as bullet or numbered lists; number only when order matters.  
- Always leave a blank line before and after each paragraph.  
- Avoid bold or italic styling in body text unless explicitly requested.  
- Use horizontal dividers (`---`) between major sections.  
- Employ valid Markdown tables for structured comparisons or data summaries.  
- Refrain from complex Unicode symbols; stick to simple characters.  
- Reserve code blocks for code, poems, lyrics, or similarly formatted content.  
- For mathematical expressions, use LaTeX outside of code blocks.

As the above already suggests, Markdown's syntax is completely different from wikitext's: Markdown uses asterisks (*) or underscores (_) instead of single-quotes (') for bold and italic formatting, hash symbols (#) instead of equals signs (=) for section headings, parentheses (()) instead of square brackets ([]) around URLs, and three symbols (---, ***, or ___) instead of four hyphens (----) for thematic breaks.
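
These correspondences are regular enough to express in a few lines. The sketch below rewrites only the constructs just listed; it is not a robust converter (it ignores tables, nesting, and code fences), and all names in it are hypothetical.

import re

def markdown_to_wikitext(text):
    """Rewrite a few common Markdown constructs as wikitext (illustrative only)."""
    # "## Heading" -> "== Heading ==" (a chatbot's "##" corresponds to wikitext "==")
    text = re.sub(r"^(#{1,6}) *(.+?) *#*$",
                  lambda m: "{0} {1} {0}".format("=" * len(m.group(1)), m.group(2)),
                  text, flags=re.MULTILINE)
    # "---", "***", or "___" thematic breaks -> "----"
    text = re.sub(r"^(?:-{3,}|\*{3,}|_{3,})$", "----", text, flags=re.MULTILINE)
    # "[label](url)" -> "[url label]"
    text = re.sub(r"\[([^\]]+)\]\((https?://[^)\s]+)\)", r"[\2 \1]", text)
    # "**bold**" -> "'''bold'''", then "*italic*" -> "''italic''"
    text = re.sub(r"\*\*(.+?)\*\*", r"'''\1'''", text)
    text = re.sub(r"\*(.+?)\*", r"''\1''", text)
    return text

print(markdown_to_wikitext("## History\n**Bold**, *italic*, and a [link](https://example.org).\n---"))
# == History ==
# '''Bold''', ''italic'', and a [https://example.org link].
# ----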

Even when they are told to do so explicitly, chatbots generally struggle to generate text using syntactically correct wikitext, as their training data lead to a drastically greater affinity for and fluency in Markdown. When told by a user to "generate an article", a chatbot will typically default to using Markdown for the generated output, which is preserved in clipboard text by the copy functions on some chatbot platforms. If instructed by a user to generate content for Wikipedia, the chatbot might itself "realize" the need to generate Wikipedia-compatible code, and might include a message like Would you like me to ... turn this into actual Wikipedia markup format (`wikitext`)?[a] in its output. If told to proceed, the resulting syntax will often be rudimentary, syntactically incorrect, or both. The chatbot might put its attempted-wikitext content in a Markdown-style fenced code block (its syntax for WP:PRE) surrounded by Markdown-based syntax and content, which may also be preserved by platform-specific copy-to-clipboard functions, leading to a telling footprint of both markup languages' syntax. This might include the appearance of three backticks in the text, such as: ```wikitext.[b]

The presence of faulty wikitext syntax mixed with Markdown syntax is a strong indicator that content is LLM-generated, especially if in the form of a fenced Markdown code block. However, Markdown alone is not such a strong indicator. Software developers, researchers, technical writers, and experienced internet users frequently use Markdown in tools like Obsidian and GitHub, and on platforms like Reddit, Discord, and Slack. Some writing tools and apps, such as iOS Notes, Google Docs, and Windows Notepad, may support Markdown editing or exporting. The increasing ubiquity of Markdown may also lead new editors to expect or assume Wikipedia to support Markdown by default.

Examples

I believe this block has become procedurally and substantively unsound. Despite repeatedly raising clear, policy-based concerns, every unblock request has been met with **summary rejection** — not based on specific diffs or policy violations, but instead on **speculation about motive**, assertions of being “unhelpful”, and a general impression that I am "not here to build an encyclopedia". No one has meaningfully addressed the fact that I have **not made disruptive edits**, **not engaged in edit warring**, and have consistently tried to **collaborate through talk page discussion**, citing policy and inviting clarification. Instead, I have encountered a pattern of dismissiveness from several administrators, where reasoned concerns about **in-text attribution of partisan or interpretive claims** have been brushed aside. Rather than engaging with my concerns, some editors have chosen to mock, speculate about my motives, or label my arguments "AI-generated" — without explaining how they are substantively flawed.

— From this revision to a user talk page

- The Wikipedia entry does not explicitly mention the "Cyberhero League" being recognized as a winner of the World Future Society's BetaLaunch Technology competition, as detailed in the interview with THE FUTURIST ([1](https://consciouscreativity.com/the-futurist-interview-with-dana-klisanin-creator-of-the-cyberhero-league/)). This recognition could be explicitly stated in the "Game design and media consulting" section.

Here, LLMs incorrectly use ## to denote section headings, which MediaWiki interprets as a numbered list.

    1. Geography

Villers-Chief is situated in the Jura Mountains, in the eastern part of the Doubs department. [...]

    1. History

Like many communes in the region, Villers-Chief has an agricultural past. [...]

    1. Administration

Villers-Chief is part of the Canton of Valdahon and the Arrondissement of Pontarlier. [...]

    1. Population

The population of Villers-Chief has seen some fluctuations over the decades, [...]

Broken wikitext


As explained above, AI chatbots are not proficient in wikitext and Wikipedia templates, leading to faulty syntax. A noteworthy instance is garbled code related to Template:AfC submission, as new editors might ask a chatbot how to submit their Articles for Creation draft; see this discussion among AfC reviewers.

Examples

Note the badly malformed category link:

[[Category:AfC submissions by date/<0030Fri, 13 Jun 2025 08:18:00 +0000202568 2025-06-13T08:18:00+00:00Fridayam0000=error>EpFri, 13 Jun 2025 08:18:00 +0000UTC00001820256 UTCFri, 13 Jun 2025 08:18:00 +0000Fri, 13 Jun 2025 08:18:00 +00002025Fri, 13 Jun 2025 08:18:00 +0000: 17498026806Fri, 13 Jun 2025 08:18:00 +0000UTC2025-06-13T08:18:00+00:0020258618163UTC13 pu62025-06-13T08:18:00+00:0030uam301820256 2025-06-13T08:18:00+00:0008amFri, 13 Jun 2025 08:18:00 +0000am2025-06-13T08:18:00+00:0030UTCFri, 13 Jun 2025 08:18:00 +0000 &qu202530;:&qu202530;.</0030Fri, 13 Jun 2025 08:18:00 +0000202568>June 2025|sandbox]]

turn0search0


ChatGPT may include citeturn0search0 (surrounded by Unicode points in the Private Use Area) at the ends of sentences, with the "search" number increasing as the text progresses. These are places where the chatbot links to an external site, but a human pasting the conversation into Wikipedia has that link converted into placeholder code. This was first observed in February 2025.

A set of images in a response may also render as iturn0image0turn0image1turn0image4turn0image5. Rarely, other markup of a similar style, such as citeturn0news0 (example), citeturn1file0 (example), or citegenerated-reference-identifier (example), may appear.
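
These placeholders are easy to find mechanically. In the sketch below, the marker vocabulary is only the set documented in this section, and the function name is hypothetical.

import re

# ChatGPT placeholders such as "citeturn0search0" or "turn0image1",
# optionally wrapped in Private Use Area code points.
PUA = "\uE000-\uF8FF"
CITETURN = re.compile(
    rf"[{PUA}]*(?:cite|i)?turn\d+(?:search|news|image|file)\d+[{PUA}]*"
)

def strip_citeturn(text):
    return CITETURN.sub("", text)

print(strip_citeturn("an International Fellowship Centre. citeturn0search1").strip())
# an International Fellowship Centre.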

Examples

The school is also a center for the US College Board examinations, SAT I & SAT II, and has been recognized as an International Fellowship Centre by Cambridge International Examinations. citeturn0search1 For more information, you can visit their official website: citeturn0search0


contentReference, oaicite, and oai_citation


Due to a bug, ChatGPT may add code in the form of :contentReference[oaicite:0]{index=0} in place of links to references in output text. Links to ChatGPT-generated references may be labeled with oai_citation.
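
A companion sketch to the one in the previous section, again with hypothetical names; the second pattern deliberately leaves the bare URL of an [oai_citation:...](...) link behind for a human to re-format.

import re

OAI_DEBRIS = re.compile(
    r":?contentReference\[oaicite:\d+\]\{index=\d+\}"  # broken reference stubs
    r"|\[oai_citation:\d+[^\]]*\]"                     # "[oai_citation:0‡site]" labels
)

def strip_oai_debris(text):
    return OAI_DEBRIS.sub("", text)

print(strip_oai_debris("classifies Sial as Rajputs :contentReference[oaicite:20]{index=20}."))
# classifies Sial as Rajputs .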

Examples

:contentReference[oaicite:16]{index=16}

1. **Ethnicity clarification**

- :contentReference[oaicite:17]{index=17}
    * :contentReference[oaicite:18]{index=18} :contentReference[oaicite:19]{index=19}.
    * Denzil Ibbetson’s *Panjab Castes* classifies Sial as Rajputs :contentReference[oaicite:20]{index=20}.
    * Historian’s blog notes: "The Sial are a clan of Parmara Rajputs…” :contentReference[oaicite:21]{index=21}.

2. :contentReference[oaicite:22]{index=22}

- :contentReference[oaicite:23]{index=23}
    > :contentReference[oaicite:24]{index=24} :contentReference[oaicite:25]{index=25}.

#### 📌 Key facts needing addition or correction:

1. **Group launch & meetings**

*Independent Together* launched a “Zero Rates Increase Roadshow” on 15 June, with events in Karori, Hataitai, Tawa, and Newtown  [oai_citation:0‡wellington.scoop.co.nz](https://wellington.scoop.co.nz/?p=171473&utm_source=chatgpt.com).

2. **Zero-rates pledge and platform**

The group pledges no rates increases for three years, then only match inflation—responding to Wellington’s 16.9% hike for 2024/25  [oai_citation:1‡en.wikipedia.org](https://en.wikipedia.org/wiki/Independent_Together?utm_source=chatgpt.com).

attribution and attributableIndex


ChatGPT may add JSON-formatted code at the end of sentences in the form of ({"attribution":{"attributableIndex":"X-Y"}}), with X and Y being increasing numeric indices.

Examples

^[Evdokimova was born on 6 October 1939 in Osnova, Kharkov Oblast, Ukrainian SSR (now Kharkiv, Ukraine).]({"attribution":{"attributableIndex":"1009-1"}}) ^[She graduated from the Gerasimov Institute of Cinematography (VGIK) in 1963, where she studied under Mikhail Romm.]({"attribution":{"attributableIndex":"1009-2"}}) [oai_citation:0‡IMDb](https://www.imdb.com/name/nm0947835/?utm_source=chatgpt.com) [oai_citation:1‡maly.ru](https://www.maly.ru/en/people/EvdokimovaA?utm_source=chatgpt.com)

Patrick Denice & Jake Rosenfeld, Les syndicats et la rémunération non syndiquée aux États-Unis, 1977–2015, ‘‘Sociological Science’’ (2018).]({“attribution”:{“attributableIndex”:“3795-0”}})

utm_source=chatgpt.com

ChatGPT may append the URL query parameter utm_source=chatgpt.com or, as of August 2025, utm_source=openai to URLs that it cites as sources.
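
Stripping the tracking parameter while keeping the rest of the URL intact takes only the standard library. Below is a sketch; the set of parameter values checked is just the two documented above.

from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

AI_UTM_VALUES = {"chatgpt.com", "openai"}

def strip_ai_utm(url):
    """Remove a utm_source parameter added by ChatGPT, keeping other parameters."""
    parts = urlparse(url)
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if not (k == "utm_source" and v in AI_UTM_VALUES)]
    return urlunparse(parts._replace(query=urlencode(query)))

print(strip_ai_utm("https://example.org/story?id=7&utm_source=chatgpt.com"))
# https://example.org/story?id=7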

Examples

Following their marriage, Burgess and Graham settled in Cheshire, England, where Burgess serves as the head coach for the Warrington Wolves rugby league team. [https://www.theguardian.com/sport/2025/feb/11/sam-burgess-interview-warrington-rugby-league-luke-littler?utm_source=chatgpt.com]

Vertex AI documentation and blog posts describe watermarking, verification workflow, and configurable safety filters (for example, person‑generation controls and safety thresholds). ([cloud.google.com](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images?utm_source=openai))


Named references declared in references section but unused in article body

An LLM may define a named reference inside the <references>...</references> block without ever invoking it in the article body; MediaWiki flags such unused list-defined references with a cite error.

Examples

== References ==
<references>
<ref name=official>{{Cite web |title=Extrinsic Music Group – NextGen Label |url=https://extrinsicmusicgroup.com/ |website=extrinsicmusicgroup.com |access-date=2025-04-24}}</ref>
</references>

Non-existent categories


LLMs sometimes hallucinate non-existent categories (which appear as red links) because their training set includes obsolete and renamed categories that they reproduce in new content. They may also treat ordinary references to topics as categories, thus generating non-existent categories. Note that this is also a common error made by new or returning editors.

Examples

[[Category:American hip hop musicians]]

rather than

[[Category:American hip-hop musicians]]

Citations

Broken external links

If a new article or draft has multiple citations with external links, and most of them are broken (error 404 pages), this is a clear sign of an AI-generated page, particularly if the dead links are not found in website archiving services like the Internet Archive or Archive Today. Most links break over time (see link rot), but a dead link with no archived copy is unlikely ever to have been valid.
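
Whether the linked pages exist at all can be checked with the standard library. The sketch below is minimal: a real check should also consult archive services and respect rate limits, and some sites reject HEAD requests, so a 405 response is not proof of a dead link.

import urllib.request
import urllib.error

def link_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "citation-link-check"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status  # e.g. 200 for a live page
    except urllib.error.HTTPError as err:
        return err.code             # e.g. 404 for a dead link
    except urllib.error.URLError:
        return None                 # DNS failure, timeout, connection refused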

Invalid DOIs and ISBNs


A checksum can be used to verify ISBNs. An invalid checksum is a very likely sign that an ISBN is incorrect, and citation templates will display a warning if so. DOIs, meanwhile, are more resistant to link rot than regular hyperlinks, so they can be checked directly. Unresolvable DOIs and invalid ISBNs can be indicators of hallucinated references.
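
The ISBN-13 check is simple enough to run by hand or in a few lines; the sketch below implements the mod-10 weighting (ISBN-10 uses a different, mod-11 scheme not shown), and the function name is illustrative.

def isbn13_is_valid(isbn):
    """Verify the ISBN-13 checksum: digits weighted 1,3,1,3,... must sum to 0 mod 10."""
    # Assumes the input contains only digits, hyphens, and spaces.
    digits = [int(c) for c in isbn.replace("-", "").replace(" ", "")]
    if len(digits) != 13:
        return False
    return sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits)) % 10 == 0

print(isbn13_is_valid("9780470521571"))  # True: the book ISBN cited below checks out
print(isbn13_is_valid("9780470521572"))  # False: last digit altered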

Related signs are DOIs that point to an entirely different article, and general book citations without page numbers. The following passage, for example, was generated by ChatGPT.

Ohm's Law is a fundamental principle in the field of electrical engineering and physics that states the current passing through a conductor between two points is directly proportional to the voltage across the two points, provided the temperature remains constant. Mathematically, it is expressed as V=IR, where V is the voltage, I is the current, and R is the resistance. The law was formulated by German physicist Georg Simon Ohm in 1827, and it serves as a cornerstone in the analysis and design of electrical circuits [1]. Ohm’s Law applies to many materials and components that are "ohmic," meaning their resistance remains constant regardless of the applied voltage or current. However, it does not hold for non-linear devices like diodes or transistors [2][3].

References:

1. Dorf, R. C., & Svoboda, J. A. (2010). Introduction to Electric Circuits (8th ed.). Hoboken, NJ: John Wiley & Sons. ISBN 9780470521571.

2. M. E. Van Valkenburg, “The validity and limitations of Ohm’s law in non-linear circuits,” Proceedings of the IEEE, vol. 62, no. 6, pp. 769–770, Jun. 1974. doi:10.1109/PROC.1974.9547

3. C. L. Fortescue, “Ohm’s Law in alternating current circuits,” Proceedings of the IEEE, vol. 55, no. 11, pp. 1934–1936, Nov. 1967. doi:10.1109/PROC.1967.6033

The book reference appears plausible – a book on electric circuits would likely contain information about Ohm's law – but without a page number, the citation is not usable to verify the claims in the prose. Worse, both Proceedings of the IEEE citations are completely made up. The DOIs lead to entirely different articles and have other problems as well. For instance, C. L. Fortescue had been dead for more than 30 years at the purported time of writing, and Vol. 55, Issue 11 does not list any article that matches anything remotely close to the information given in reference 3. Note also the use of curly quotation marks and apostrophes in some, but not all, of the above text, another indicator that the text may be LLM-generated.

Incorrect or unconventional use of references


AI tools may have been prompted to include references, and make an attempt to do so as Wikipedia expects, but fail with some key implementation details or stand out when compared with conventions.

In the example below, note the incorrect attempt at re-using references. The tool used here was not capable of searching for non-confabulated sources (this was the day before Bing Deep Search launched) but nonetheless found one real reference. The syntax for re-using the references was incorrect.

In this case, the Smith, R. J. source – presumably generated as the "third source", via the link https://pubmed.ncbi.nlm.nih.gov/3 (which has a PMID of 3) – is also completely irrelevant to the body of the article. The user did not check the reference before converting it to a {{cite journal}} reference, even though the links resolve.

The LLM in this case has diligently included the incorrect re-use syntax after every single full stop.

For over thirty years, computers have been utilized in the rehabilitation of individuals with brain injuries. Initially, researchers delved into the potential of developing a "prosthetic memory."<ref>Fowler R, Hart J, Sheehan M. A prosthetic memory: an application of the prosthetic environment concept. ''Rehabil Counseling Bull''. 1972;15:80–85.</ref> However, by the early 1980s, the focus shifted towards addressing brain dysfunction through repetitive practice.<ref>{{Cite journal |last=Smith |first=R. J. |last2=Bryant |first2=R. G. |date=1975-10-27 |title=Metal substitutions incarbonic anhydrase: a halide ion probe study |url=https://pubmed.ncbi.nlm.nih.gov/3 |journal=Biochemical and Biophysical Research Communications |volume=66 |issue=4 |pages=1281–1286 |doi=10.1016/0006-291x(75)90498-2 |issn=0006-291X |pmid=3}}</ref> Only a few psychologists were developing rehabilitation software for individuals with Traumatic Brain Injury (TBI), resulting in a scarcity of available programs.<sup>[3]</sup> Cognitive rehabilitation specialists opted for commercially available computer games that were visually appealing, engaging, repetitive, and entertaining, theorizing their potential remedial effects on neuropsychological dysfunction.<sup>[3]</sup>

Some LLMs or chatbot interfaces use their own method of providing footnotes, typically using the character ↩:

References

Would you like help formatting and submitting this to Wikipedia, or do you plan to post it yourself? I can guide you step-by-step through that too.

Footnotes

  1. KLAS Research. (2024). Top Performing RCM Vendors 2024. https://klasresearch.com ↩ ↩2
  2. PR Newswire. (2025, February 18). CureMD AI Scribe Launch Announcement. https://www.prnewswire.com/news-releases/curemd-ai-scribe ↩

Miscellaneous


Abrupt cut-offs


AI tools may abruptly stop generating content, for example if they predict the end-of-text token (appearing as <|endoftext|>) next. Also, the number of tokens in a single response is usually limited, and producing more output requires the user to select "continue generating".

This method is not foolproof, as a malformed copy/paste from one's local computer can also cause this. It may also indicate a copyright violation rather than the use of an LLM.

Discrepancies in writing style and variety of English


A sudden shift in an editor's writing style, such as unexpectedly flawless grammar compared to their other communication, may indicate the use of AI tools.

Another discrepancy is a mismatch of user location, national ties of the topic to a variety of English, and the variety of English used. A human writer from India writing about an Indian university would probably not use American English; however, LLM outputs use American English by default, unless prompted otherwise.[3] Note that non-native English speakers tend to mix up English varieties, and such signs should only raise suspicion if there is a sudden and complete shift in an editor's English variety use.

Age of text relative to ChatGPT launch


ChatGPT was launched to the public on November 30, 2022. Although OpenAI had similarly powerful LLMs before then, they were paid services and not particularly accessible or known to lay people. ChatGPT experienced extreme growth immediately on launch.

It is very unlikely that any particular text added to Wikipedia prior to November 30, 2022 was generated by an LLM. If an edit to a page was made before this date, AI use can be safely ruled out for that revision. While some text added as far back as 20 years ago may appear to match some of the AI signs given in this list, and even convincingly appear to have been AI generated, the vastness of Wikipedia allows for these rare coincidences.

Overwhelmingly verbose edit summaries


AI-generated edit summaries are often unusually long, written as formal, first-person paragraphs without abbreviations, and/or conspicuously itemize Wikipedia's conventions.

Most editors using AI do not ask for summaries to be generated.

Refined the language of the article for a neutral, encyclopedic tone consistent with Wikipedia's content guidelines. Removed promotional wording, ensured factual accuracy, and maintained a clear, well-structured presentation. Updated sections on history, coverage, challenges, and recognition for clarity and relevance. Added proper formatting and categorized the entry accordingly

— Edit summary from this revision to Khaama Press

I formalized the tone, clarified technical content, ensured neutrality, and indicated citation needs. Historical narratives were streamlined, allocation details specified with regulatory references, propagation explanations made reader-friendly, and equipment discussions focused on availability and regulatory compliance, all while adhering to encyclopedic standards.

— Edit summary from this revision to 4-metre band

**Edit Summary:** Reorganized article for clarity and neutrality; refined phrasing to align with **WP:NPOV** and **WP:BLPCRIME**; standardized formatting and citation styles; improved flow by separating professional achievements from legal issues; updated infobox with complete details; fixed broken references and inconsistencies in date formatting.

— Edit summary from this revision to David Bitel

See also


Notes

  1. ^ Example (deleted, administrators only)
  2. ^ Example of ```wikitext on a draft.

References

  1. ^ a b c d Russell, Jenna; Karpinska, Marzena; Iyyer, Mohit (2025). People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vienna, Austria: Association for Computational Linguistics. pp. 5342–5373. arXiv:2501.15654. doi:10.18653/v1/2025.acl-long.267. Retrieved 2025-09-05 – via ACL Anthology.
  2. ^ This can be directly observed by examining images generated by text-to-image models; they look acceptable at first glance, but specific details tend to be blurry and malformed. This is especially true for background objects and text.
  3. ^ a b "2025 findings". ACL Anthology. (Original PDF)
  4. ^ "10 Ways AI Is Ruining Your Students' Writing". Chronicle of Higher Education. September 16, 2025.