Most of what you read about Generative Engine Optimization is marketing for vendors selling useless solutions. This guide tries the opposite: state openly what the data proves, what it does not, and what the industry keeps repeating out of inertia even when the evidence says otherwise.

GEO is no longer a niche in 2026. Gartner released the 2026 CMO Spend Survey and the number is clear: CMOs allocate 15.3% of marketing budgets to AI, but only 30% of organizations claim to be ready to scale these capabilities. The gap is the operational terrain of GEO. Whoever understands what actually works builds a defensible advantage before the rest of the market catches up.

This guide is the operational synthesis of what works. Three principles run through it: rigor on evidence, honesty about limits, applied tone. No lists of best practices taken for granted, no unverifiable promises, no tactics that require proprietary software to be implemented.

Definition of Generative Engine Optimization

Generative Engine Optimization (GEO) is the discipline that increases the probability that a brand gets cited and recommended inside answers generated by AI engines like ChatGPT, Gemini, Perplexity, Claude, and Google AI Overviews. The difference from classical SEO is structural, not terminological.

The framework was first formalized in GEO: Generative Engine Optimization, a paper presented at SIGKDD 2024 by researchers from Princeton, Georgia Tech, Allen AI, and IIT Delhi. The authors introduced GEO-bench, a dataset of ten thousand queries, and demonstrated that targeted content optimizations can boost visibility in AI citations by up to 40%. The field has expanded since then. The paper GEO: How to Dominate AI Search by Chen et al., published in September 2025, showed that AI Search systems systematically favor earned media over brand-owned content, a finding that on its own rewrites the strategy of many companies.

For those who want to start from the conceptual basics, we have published a dedicated piece on what Generative Engine Optimization is and a comparison of how GEO differs from traditional SEO.

The state of the Italian market in 2026

Italy is not behind. According to Comscore data aggregated in October 2025, around sixteen million Italians use AI applications every month. ChatGPT alone reaches almost fifteen million of them. Perplexity grew 2,351% year over year, the highest growth rate recorded in the category in the country (Orizzonte Scuola, 2025).

The Artificial Intelligence Observatory at Politecnico di Milano estimates that the Italian AI market grew 50% in 2025, reaching 1.8 billion euros. 84% of large enterprises have already activated GenAI licenses (Osservatorio Polimi, 2025). These numbers make clear why the Italian market needs its own measurement standards. For this reason we launched Refinea Analysis, the public observatory of the AI Visibility Index for Italian industries.

How AI engines really work

Knowing what to optimize requires first knowing how engines build their answers. The mechanics changed significantly between 2024 and 2026.

ChatGPT

ChatGPT Search relies on third-party search providers, including Microsoft Bing, and on content from media partners (official OpenAI documentation). The dedicated bot is OAI-SearchBot and should be distinguished from GPTBot, which collects data for training.
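The distinction between the two bots can be checked against a site's robots.txt. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt content, the example URL, and the policy itself (admit the search bot, block the training bot) are illustrative assumptions, not a recommendation from OpenAI or from this guide.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: admit the search bot, keep the training bot out.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
"""


def bot_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each user agent, whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# example.com is a placeholder domain.
access = bot_access(ROBOTS_TXT, "https://example.com/guide", ["OAI-SearchBot", "GPTBot"])
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against your live robots.txt shows whether a rule written for one bot is accidentally applied to the other.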

Claude

Anthropic published an engineering post on its multi-agent system describing the lead-agent-plus-subagent architecture, which, according to Anthropic’s internal benchmarks, outperforms the single-agent pattern by 90.2%. The practical GEO consequence is that Claude can run several search passes in depth rather than a single shallow one. Dense, well-structured content has a higher chance of getting cited.

Perplexity

Perplexity uses its own Sonar system, integrated with multiple web sources. The citation pattern is the most transparent of all engines because Perplexity explicitly shows the sources used to generate every answer. This also makes it the most rigorous testing ground for validating GEO strategies.

Google AI Overviews

AI Overviews have been available in Italy since March 2025, as part of the EU rollout. Google has not published official ranking factors for AI Overviews. Every claim you read along the lines of “factor X counts Y%” comes from vendor correlation studies, never from Google sources. The usual rule applies: treat these claims as operational hypotheses, not facts.

The seven findings that change 2026 strategy

This is where we get to operational ground. Seven findings, each grounded in public data, each with a direct tactical implication.

1. Earned media beats owned media

The Chen et al. paper cited above demonstrated that AI Search systems systematically prioritize third-party editorial coverage over content published on the brand’s own site. The figure is not marginal: in some categories the ratio exceeds three to one in favor of earned media.

The operational implication changes the editorial plan. Investing ten hours a week writing on your own blog yields a lower citation rate than investing the same ten hours positioning the brand in industry publications, podcasts, and analyst research. This does not mean the blog is useless. It means the blog is defensive infrastructure, while the offense is played outside your own domain.

2. Reddit, Wikipedia, and YouTube dominate citations

Similarweb’s analysis of ChatGPT citations in the United States found that Wikipedia (13.15%) and Reddit (11.97%) together generate more than 25% of all citations. Publications considered top-tier traditional media like Wall Street Journal, New York Times, and Bloomberg do not appear in the top 20.

On Perplexity the concentration is even more extreme: Reddit covers 46.7% of the top-10 sources cited according to Profound’s analysis. Search Engine Land data confirms that Reddit, YouTube, and LinkedIn are the three most cited sources aggregated across the main AI engines.

The consequence is simple and uncomfortable for many B2B brands. A GEO strategy that ignores presence on Reddit, YouTube, and Wikipedia starts at a disadvantage. These platforms are not optional: they are the main terrain where the match is played.

3. Schema.org does not move AI citations

This is the most contrarian finding of 2026. In May 2026 Ahrefs published a rigorous study on 1,885 pages with a control group of 4,000, measuring AI citations before and after adding schema markup. The measured effect was zero. In fact, on Google AI Overviews the study recorded a 4.6% drop in citations for pages with schema versus the prior period. One caveat: the study analyzed pages that already had a strong AI citation baseline, so the result applies most directly to brands with established AI visibility.

The GEO industry has been selling schema markup as the first optimization lever for two years. The data says it does not move the needle. Search Engine Land synthesized the result with the headline “no hype”. We continue to recommend schema for the advantage on traditional Google rich results, but we no longer present it as a GEO lever.

4. llms.txt is vendor marketing

Riding the 2024 enthusiasm, the industry announced that llms.txt would become the standard for telling AI crawlers which content to prioritize. Two years later, the collected data proves the opposite. A recent study on 500 million AI bot visits over 90 days found only 408 actual fetches of the llms.txt file. Practically zero.
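To put "practically zero" in proportion, here is a quick calculation that uses only the two figures quoted above. The variable names are ours.

```python
# Figures quoted in the text above.
LLMS_TXT_FETCHES = 408
AI_BOT_VISITS = 500_000_000

# Fraction of all AI bot visits that requested llms.txt.
fetch_share = LLMS_TXT_FETCHES / AI_BOT_VISITS

# How many bot visits occur, on average, for each llms.txt fetch.
visits_per_fetch = AI_BOT_VISITS / LLMS_TXT_FETCHES

print(f"Share of visits that fetched llms.txt: {fetch_share:.7%}")
print(f"Roughly one fetch every {visits_per_fetch:,.0f} visits")
```

In other words, the file was requested about once for every 1.2 million bot visits in that sample.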

Adoption across domains plateaued around 10%. Our operational recommendation is: publish it because it costs nothing and serves as an insurance policy, but do not include it in the GEO strategy deliverables for someone paying you.

5. Passage-level structure matters more than total length

Discovered Labs analyzed two million AI citations across ten thousand pages and found recurring structural patterns in cited passages. Related research on Google AI Overviews (Wellows, 2026) indicates that cited passages tend to have an average length between 134 and 167 words and to answer a single question self-sufficiently. A 3,000-word article structured as a monolith produces fewer citations than a 1,500-word article structured as ten 150-word units, each answering a sub-question.

The consequence is the BLUF pattern: the first paragraph must answer the article’s main question self-sufficiently. Everything that follows is expansion, not preamble.
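A draft can be checked against this pattern with a few lines of Python. The sketch below splits a text on blank lines and labels each passage by word count against the 134 to 167 range cited above. Two assumptions: a blank line marks a passage boundary, and the range is a soft target taken from reported averages, not a threshold any engine is known to enforce.

```python
import re

# Range reported for cited passages in the text above; a soft target only.
MIN_WORDS, MAX_WORDS = 134, 167


def audit_passages(article: str) -> list[dict]:
    """Split an article on blank lines and label each passage by word count."""
    passages = [p.strip() for p in re.split(r"\n\s*\n", article) if p.strip()]
    report = []
    for index, passage in enumerate(passages, start=1):
        words = len(passage.split())
        if words < MIN_WORDS:
            status = "short"
        elif words > MAX_WORDS:
            status = "long"
        else:
            status = "in range"
        report.append({"passage": index, "words": words, "status": status})
    return report


# Placeholder text: three passages of 150, 2, and 300 words.
draft = "\n\n".join([" ".join(["word"] * 150), "Too brief.", " ".join(["word"] * 300)])
for row in audit_passages(draft):
    print(row)
```

A "long" label is a prompt to check whether the passage answers more than one question and could be split. It is not a verdict on quality.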

6. Recency is a real factor

Seer Interactive studied AI bot crawl patterns and found that 65% of AI bot hits land on content published in the past year and 79% on content from the past two years. The study is single-source and should be treated with methodological caution, but the pattern is consistent with what we observe in Refinea data.

The operational implication is not “publish more”. It is “keep fresh what you already have”. Substantially updating an evergreen article every six months produces more GEO value than writing two new articles in the same time.
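The six-month cadence is easy to turn into a recurring check. This minimal sketch assumes you can export each page's last substantial update date from your CMS. The paths, dates, and the 182-day interval are illustrative assumptions.

```python
from datetime import date, timedelta

# Roughly six months; an assumption, adjust to your own cadence.
REFRESH_INTERVAL = timedelta(days=182)


def stale_pages(last_updated: dict[str, date], today: date) -> list[str]:
    """Return pages not updated within the refresh interval, oldest first."""
    cutoff = today - REFRESH_INTERVAL
    overdue = [(updated, url) for url, updated in last_updated.items() if updated < cutoff]
    return [url for _, url in sorted(overdue)]


# Hypothetical inventory of evergreen articles.
inventory = {
    "/guides/geo-basics": date(2025, 1, 10),
    "/guides/geo-vs-seo": date(2025, 11, 1),
    "/guides/ai-visibility": date(2024, 6, 1),
}
print(stale_pages(inventory, today=date(2026, 1, 15)))
```

The check only tells you which pages are due. Whether an update counts as substantial remains an editorial judgment.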

7. Cross-source consistency beats single citation

McKinsey published in late 2025 a report estimating $750 billion of B2B spending funneled through AI search by 2028. In the report McKinsey observes that brand-owned content represents only 5-10% of the sources AI engines use to generate answers. The remaining 90-95% is composed of third parties.

The strategic consequence is that GEO is not the optimization of a single asset. It is the coordination of brand narrative across dozens of external sources that AI engines consult in parallel. Brands that say different things on Wikipedia, on Reddit, and on their own site produce inconsistent retrieval. Cross-source narrative coherence is one of the most underrated operational levers of 2026.
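One way to make "saying different things" concrete is to record the same set of brand facts as each source states them and flag the fields where the values disagree. The sketch below is a simplified illustration: the sources, fields, and values are hypothetical, and the exact-match comparison after lowercasing will not catch paraphrases.

```python
from collections import defaultdict


def find_conflicts(claims_by_source: dict[str, dict[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Map each disputed field to the distinct values found and who states them."""
    variants: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for source, claims in claims_by_source.items():
        for field, value in claims.items():
            # Normalise case and whitespace so trivial differences are ignored.
            variants[field][value.strip().lower()].append(source)
    return {
        field: dict(values) for field, values in variants.items() if len(values) > 1
    }


# Hypothetical facts as stated by three sources.
claims = {
    "own site": {"founded": "2019", "category": "GEO platform"},
    "Wikipedia": {"founded": "2018", "category": "GEO platform"},
    "Reddit": {"category": "geo platform "},
}
print(find_conflicts(claims))
```

Here only the founding year is flagged, because the category matches once case and whitespace are normalised.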

The Refinea operational framework

Based on the seven findings above, we built an operational framework in four phases. It is what we apply internally and what our product is built on.

Phase 1: baseline measurement

Before optimizing you need to know where you start. Baseline measurement includes three components.

The first is the brand’s AI Visibility Index on prompts that reflect the commercial category of interest. The second is the distribution of citation sources that engines use when talking about the brand. The third is the gap between the narrative the brand publishes on its own site and the one that emerges from AI answers. The Refinea platform automates these three measurements, but the principle also holds for those who measure manually.

Phase 2: prompt intelligence

A GEO strategy based on invented prompts is a strategy that optimizes for hypothetical scenarios. Most of the tools on the market work this way. The problem is that AI engines get queried by real users with real language, not with prompts constructed at a desk. Measurement on invented prompts produces data that looks actionable but does not describe market behaviour.

The correct pattern is to start from real queries. Refinea extracts aggregated search demand from premium providers, applies semantic clustering, simulates the corresponding intents, and validates the output against a database of more than one million real prompts. The platform then crosses these prompts with the customer’s Google Search Console data to weigh the relative importance of each cluster based on the real traffic the brand already attracts.

Phase 3: cross-source optimization

Optimization operates on three parallel planes.

On the editorial plane you work on owned content for defensive coverage: BLUF pattern, passage-level structure, recurring updates, image alt text, semantic internal linking. These do not move citations on their own, but they are a prerequisite for not losing them.

On the earned media plane you coordinate presence in industry publications, podcasts, analyst research, reviews on comparators like G2 or Capterra. This is where most of the citation gain is won. Refinea provides customers with the map of the publications most cited by AI engines in their category, reducing time spent guessing where to invest.

On the entity graph plane you build and maintain consistent presence on Wikipedia, Reddit, YouTube, LinkedIn, and other sources with high citation density. This does not mean spamming: it means being present where AI engines look and giving them quality material.

Phase 4: Brand Memory

Brand Memory is the Refinea module that catalogues the customer company’s unique facts in the form of Proof Points, Expert Voices, and Facts. It serves two purposes. The first is to allow generation of AI-optimized content that respects Google’s E-E-A-T guidelines (Experience, Expertise, Authoritativeness, Trustworthiness), because every claim produced is substantiated by a verifiable internal primary source. The second is to guarantee the cross-source narrative consistency discussed at point 7: having a single internal source of truth reduces the risk that the brand says different things on different channels.

What not to do in 2026

Three mistakes are still frequent and deserve to be named.

Optimizing the site for ChatGPT directly. ChatGPT does not crawl in real time during a conversation. It uses its own index, which is fed by OAI-SearchBot and third-party sources. Optimizing the site is useful, but the real terrain is what ChatGPT encounters in the sources it cites.

Buying visibility on low-authority vendor listings. Listings on bottom-tier directories produce backlinks that do not move the AI Visibility Index (AVI). AI engines ignore these sources in retrieval. Wasted money.

Measuring AI visibility with single prompts. A single ChatGPT query is not a measurement, it is an anecdote. You need multiple samples, curated prompts, repeated runs. The methodology section of Refinea Analysis describes the standard protocol.
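The difference between an anecdote and a measurement can be quantified. The sketch below computes a Wilson score interval for the brand's citation rate over repeated runs. This is a standard statistical technique that we chose for illustration, not necessarily the protocol used in Refinea Analysis. It also assumes runs are independent, which repeated queries to the same engine satisfy only approximately.

```python
import math


def wilson_interval(hits: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a citation rate (z=1.96 gives about 95%)."""
    if runs <= 0 or not 0 <= hits <= runs:
        raise ValueError("need runs > 0 and 0 <= hits <= runs")
    rate = hits / runs
    denominator = 1 + z**2 / runs
    centre = (rate + z**2 / (2 * runs)) / denominator
    margin = z * math.sqrt(rate * (1 - rate) / runs + z**2 / (4 * runs**2)) / denominator
    return max(0.0, centre - margin), min(1.0, centre + margin)


# One query that cited the brand versus 30 citations in 100 runs.
for hits, runs in [(1, 1), (30, 100)]:
    low, high = wilson_interval(hits, runs)
    print(f"{hits}/{runs} cited: plausible rate between {low:.0%} and {high:.0%}")
```

A single positive answer is compatible with a true citation rate anywhere from about 21% to 100%, which is why it says almost nothing. A hundred runs narrow the range to something a decision can rest on.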

What to do next week

For those who want to translate this guide into immediate action, three measurable steps over the next seven days.

Monday: identify the twenty most probable commercial prompts a real customer would use to search your category. Not the prompts you would want, the ones a user would actually use. If you have access to Google Search Console, start from the queries with the highest volume.
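If you start from a Search Console export, the shortlist can be pulled with the standard library. The column names below ("Top queries", "Impressions") are assumptions about the CSV export format and are exposed as parameters so they can be adjusted. Impressions stand in for "volume", and the sample rows are invented.

```python
import csv
import io


def top_queries(
    csv_text: str,
    limit: int = 20,
    query_col: str = "Top queries",
    volume_col: str = "Impressions",
) -> list[str]:
    """Return the queries with the highest volume, largest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Strip thousands separators in case the export formats numbers as "1,200".
    rows.sort(key=lambda row: int(row[volume_col].replace(",", "")), reverse=True)
    return [row[query_col] for row in rows[:limit]]


# Invented sample in the assumed export layout.
SAMPLE_EXPORT = """\
Top queries,Clicks,Impressions
crm for small business,40,1200
best crm italy,25,3100
crm pricing comparison,10,800
"""
print(top_queries(SAMPLE_EXPORT, limit=2))
```

Search queries are shorter than the prompts people type into AI engines, so treat the output as raw material to rephrase into natural questions, not as the final prompt list.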

Wednesday: submit those twenty prompts to ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews. Note how many times the brand gets cited, which competitors get cited in its place, which sources the engines use.

Friday: the list of sources engines cite is your editorial plan for the next quarter. Knowing which publications, which subreddits, which Wikipedia pages actually influence answers on your category is worth more than any tool.
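The notes from Wednesday can be turned into Friday's list with a simple tally. The sketch assumes one record per engine and prompt, listing the brands mentioned and the sources cited. The record layout, brand names, and domains are hypothetical.

```python
from collections import Counter


def summarize(observations: list[dict], brand: str) -> dict:
    """Tally brand mention rate, competitors mentioned, and sources cited."""
    total = len(observations)
    brand_hits = sum(1 for obs in observations if brand in obs["brands"])
    competitors = Counter(
        name for obs in observations for name in obs["brands"] if name != brand
    )
    sources = Counter(source for obs in observations for source in obs["sources"])
    return {
        "brand_mention_rate": brand_hits / total if total else 0.0,
        "competitors": competitors.most_common(),
        "sources": sources.most_common(),
    }


# Hypothetical notes: one record per engine and prompt.
notes = [
    {"engine": "ChatGPT", "prompt": "best crm for smes",
     "brands": ["Acme", "Rival"], "sources": ["reddit.com", "g2.com"]},
    {"engine": "Perplexity", "prompt": "best crm for smes",
     "brands": ["Rival"], "sources": ["reddit.com"]},
    {"engine": "Gemini", "prompt": "crm pricing comparison",
     "brands": ["Acme"], "sources": ["wikipedia.org", "reddit.com"]},
]
summary = summarize(notes, brand="Acme")
print(summary["brand_mention_rate"])
print(summary["sources"])
```

The ranked `sources` list is the starting point for the editorial plan, and `competitors` shows who is being cited in the brand's place.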

If this exercise takes too much time or you want to do it at scale, Refinea automates every step. But the value is not in the tool, it is in the framework. The framework also works manually for those who have the patience.

Generative Engine Optimization: the 2026 operational guide