3 GEO Experiments Worth Trying This Year

Most AI-generated brand descriptions are still inaccurate. Small, reversible tests help reveal what LLMs actually extract from your content.

Last month, I asked ChatGPT, Perplexity, and Gemini the same question about three clients: “Who is [Brand Name] and what do they do?”

Two out of three models answered incorrectly. Wrong services. Outdated office locations. One model even suggested a competitor as a better alternative.

And this is more than a curious mistake.

Read more about GEO Content Audit.

What AI Says About Brand Visibility

Traffic from AI-based sources grew by 527% year over year—from early 2024 to early 2025.

While the growth is real, it starts from a very low base. For most websites, referral traffic from AI systems still accounts for less than 1% of total traffic.

But if half of the AI-generated descriptions of your brand are inaccurate, this is not a future problem. It is shaping perception right now. The question isn't whether you should optimize for AI systems, but how to do it effectively: separating what truly works from SEO fundamentals repackaged and marketed as revolutionary tactics.

And unlike traditional SEO—where you can forecast traffic and revenue with a certain level of confidence—AI-driven search does not work this way. You cannot sell certainty here. You can only sell managed learning.

Most GEO tactics turn out to be SEO basics applied to a new layer of visibility.

Structure, clarity, and consistent information have always been important. Now these principles affect not only how users find and read your content, but also how AI systems summarize and cite it.

The only way to separate truth from assumptions is to run small, reversible experiments that produce data strong enough to support decision-making.

The cost of not knowing what works is higher today than the cost of finding out.

Below are three GEO experiments that help you understand how AI reads, summarizes, and reuses your content. These are practical tests that most teams can execute in 60–90 days, each offering clear insight into whether these tactics actually impact your business.

Read more about Geoptie’s free tools for SEO professionals.

Experiment 1: Build a Topic Cluster That’s LLM-Ready

Marketers have worked with content clusters for years. But GEO changes the rules. Generative systems do not read content the same way humans do. They break it into semantic units, look for clearly defined entities, straightforward answers, consistent language, and predictable structure. When the entire cluster is organized this way, AI systems find it easier to understand and cite you as a reliable source.

The first experiment tests exactly this.

Choose a Cluster With Business Value

Pick a topic where you already have strong content or where visibility is strategically important.

Use onsite search data, Google Search Console queries, and support team inquiries—these reveal the real questions your audience asks.

Often, they are the same questions potential clients ask in LLM systems.

Tip: if your support team hears the same question three times a week, that’s a signal.

Create (or Rebuild) the Cluster for Machine Readability

Here’s what works best based on testing.

Build Structure Around Natural Questions

Your H2s should reflect how people actually phrase queries:

  • “What is [topic]?”
  • “How much does [topic] cost?”
  • “Which option is best for beginners?”
  • “What should you avoid?”

AI tools prefer pages that answer questions the way users formulate them—not the way we would like them to.

Use a “Summary-First” Approach

The first 100–150 words should provide a quick, clear overview.

No slow introductions.
No warm-up stories.
No clichés like “In today’s fast-paced digital world…”

Use a Q&A Format as the Standard

Structure each page as follows:

  • Question
  • Short answer (1–2 sentences)
  • Details (2–3 paragraphs)
  • Optional: a table or list

This format is ideal for LLMs. It clearly signals where information is located.

Do Not Ignore Schema and Internal Links

Use relevant schema types: FAQPage, HowTo, Product, Organization, LocalBusiness, etc.

Use internal links to establish a clear cluster hierarchy so models don’t have to guess which page answers which question.
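
To make the schema step concrete, here is a minimal sketch in Python that emits a FAQPage block; the question and answer strings are placeholders to be replaced with the cluster's real Q&A pairs:

```python
import json

# Minimal FAQPage JSON-LD sketch; the strings below are placeholders,
# not real client content.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "How much does teeth whitening cost?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Costs vary by method; in-office treatment "
                        "is typically the most expensive option.",
            },
        }
    ],
}

print('<script type="application/ld+json">')
print(json.dumps(faq_schema, indent=2))
print("</script>")
```

Validate the output with Google's Rich Results Test before shipping it.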

Measure the Right Metrics

Over 60 days, track:

  • AI Overview appearances for target queries (incognito checks twice a week or tools like Semrush)
  • Citation patterns: Do ChatGPT, Gemini, or Perplexity reference your website? How accurately?
  • Organic traffic and conversions within the cluster
  • Description consistency: Do models describe your content the same way?
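
AI Overview appearances still need incognito or tool-based checks, but the citation-pattern snapshots can be semi-automated. A minimal sketch, assuming the official OpenAI Python SDK with an OPENAI_API_KEY set in the environment; the model name and queries are placeholders, and Gemini or Perplexity would need their own clients:

```python
import csv
import datetime
from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUERIES = [  # placeholder target queries for the cluster
    "What are the main teeth whitening options?",
    "How much does professional teeth whitening cost?",
]

def snapshot(path="ai_visibility_log.csv"):
    """Append today's raw answers so citation accuracy and brand
    mentions can be reviewed across the 60-day window."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for query in QUERIES:
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder model name
                messages=[{"role": "user", "content": query}],
            )
            writer.writerow([datetime.date.today().isoformat(),
                             query, resp.choices[0].message.content])

if __name__ == "__main__":
    snapshot()  # run twice a week, matching the incognito checks
```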

Compare Results With a Control Group

This is critical: compare the improved cluster with another cluster you did not optimize.

If the LLM-ready cluster gains more AI Overview visibility, produces more accurate answers, and maintains stable organic traffic—you’ve found a tactic worth scaling.

Example

I rebuilt a dental cluster on the topic of “teeth whitening options.” Within 75 days, the site appeared in AI Overviews for 9 of 13 target queries (up from 2). Traditional organic traffic remained stable, while visibility in AI-generated answers increased significantly.

Why This Works (and Not Only for AI)

The same structural improvements that help AI systems read your content more effectively typically improve traditional SEO as well.

Clear headings, direct answers, and logical content organization help Google index your pages more efficiently.

Users also appreciate clarity. Faster access to answers often correlates with better on-page engagement.

Even if AI traffic is still small, you are building content that performs better across all channels.

Experiment 2: Run a Sprint to Improve Brand Entity and Sentiment

One of the key weaknesses of modern generative models is how they handle nuance. If brand information is inconsistent or contradictory across sources, LLMs fill in the gaps with their own assumptions, and they can confidently serve users completely incorrect data about a brand.

LLMs form their understanding of a company based on various sources:

  • reviews on Google, Yelp, Trustpilot, and other platforms;
  • business directories and catalogs;
  • editorial mentions in the media;
  • discussions on Reddit and industry forums;
  • company profiles on social media;
  • structured data on the website (schema markup);
  • knowledge graph–related sources — including Wikidata or Crunchbase.

All these signals together form the “brand story” that AI reproduces in its responses. If the story is presented unevenly, models combine fragmented or outdated data into a distorted brand narrative.

This is exactly the problem the second experiment solves.

Audit What AI Already “Knows” About the Brand

The first step is to analyze the current state. To do this, you need to ask ChatGPT, Gemini, and Perplexity basic questions:

  • “Who is [Brand Name]?”
  • “What does [Brand] offer?”
  • “Is [Brand] suitable for [a specific use case]?”
  • “What are alternatives to [Brand]?”

During analysis, you should document:

  • accuracy of the description;
  • general sentiment (positive, neutral, negative);
  • sources cited;
  • competitors mentioned;
  • any inaccuracies, outdated data, or fabricated facts.

This forms the baseline for later comparison. Take screenshots and save them.
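
Alongside the screenshots, it helps to save the raw answers in a structured file so the 60–90 day comparison is not done from memory. A minimal sketch under the same OpenAI SDK assumption as in Experiment 1; the brand name and model are placeholders, and the annotation fields are left empty for manual review:

```python
import datetime
import json
from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()
BRAND = "Example HVAC Co."  # hypothetical brand name

PROMPTS = [
    f"Who is {BRAND}?",
    f"What does {BRAND} offer?",
    f"Is {BRAND} suitable for commercial facilities?",
    f"What are alternatives to {BRAND}?",
]

records = []
for prompt in PROMPTS:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    records.append({
        "date": datetime.date.today().isoformat(),
        "prompt": prompt,
        "answer": resp.choices[0].message.content,
        # Filled in manually during review:
        "accuracy": None, "sentiment": None,
        "sources": [], "competitors": [],
    })

with open("brand_baseline.json", "w") as f:
    json.dump(records, f, indent=2)
```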

Unify and Clean All Core Brand Signals

The next step is aligning information across all key channels. If data is scattered, models will merge it according to their own logic, creating a distorted brand profile.

The most important areas to update:

1. Optimization on the Website

  • Update the homepage and the “About Us” page with clear statements about what the company does, which regions it serves, who its customers are, and what its key advantages are.
  • Add up-to-date Organization and LocalBusiness schema.
  • Consolidate or remove duplicate pages that may confuse the models.
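
As a reference point, a minimal Organization block might look like the sketch below; every value is a placeholder to be replaced with the brand's real, consistent details, and LocalBusiness follows the same pattern with address and opening-hours fields added:

```python
import json

# Minimal Organization JSON-LD sketch; all values are placeholders.
org_schema = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example HVAC Co.",
    "url": "https://www.example.com",
    "description": "Commercial HVAC installation and maintenance "
                   "for offices, retail, and industrial facilities.",
    "areaServed": "Example Metro Area",
    "sameAs": [  # profiles models may use to cross-check identity
        "https://www.linkedin.com/company/example-hvac",
        "https://www.crunchbase.com/organization/example-hvac",
    ],
}

print('<script type="application/ld+json">')
print(json.dumps(org_schema, indent=2))
print("</script>")
```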

2. Alignment of External Sources

  • Synchronize information in business directories, including categories, descriptions, and contact details.
  • Encourage customers to leave detailed reviews — LLMs consider specificity, not just star ratings.
  • Secure relevant editorial mentions in industry media.

3. Activity in Communities

  • Participate in professional forums and topic-specific discussions, including on Reddit.
  • Models often use such sources to evaluate reputation and expertise.

Re-Measure Results

After 60–90 days, repeat the initial queries to the models. The comparison should show changes in:

  • accuracy of descriptions;
  • sentiment;
  • brand positioning in list-style responses;
  • frequency of mentions;
  • correctness of reproduced products, services, markets, or geographies.
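
If the baseline answers were saved as structured records (as in the sketch above), a rough first pass of this comparison can be scripted; low text similarity flags the answers that changed most and deserve manual re-scoring first. The file names are the hypothetical ones from that sketch:

```python
import json
from difflib import SequenceMatcher

with open("brand_baseline.json") as f:
    baseline = json.load(f)
with open("brand_followup.json") as f:  # same capture script, re-run later
    followup = json.load(f)

# Rough text similarity per prompt; low scores mark the answers
# that changed the most since the baseline.
for before, after in zip(baseline, followup):
    ratio = SequenceMatcher(None, before["answer"], after["answer"]).ratio()
    print(f"{before['prompt'][:45]:<45} similarity={ratio:.2f}")
```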

Identify Which Signals Drove the Results

In different projects, different factors play the key role. Sometimes the biggest effect comes from updating business profiles, sometimes from detailed reviews, and in other cases from topical publications on authoritative sites.

The goal of the experiment is to identify which specific signals most strongly influence brand representation in AI systems, and to use that knowledge to build a scalable operating model.

Example

A regional HVAC services company found that AI systems described it primarily as a residential provider, even though the majority of its revenue came from the commercial segment. After the Google Business Profile, homepage, and key directories were updated with a clear emphasis on commercial services, the models began to reflect this segment correctly in under 70 days.

Why This Approach Works

Although the techniques may seem familiar from local SEO, their impact has greatly expanded. The same signals that determine local visibility now shape the brand image in large language models. LLMs aggregate information from many sources, so consistency and accuracy become critically important for correct brand positioning.

The strategy does not require mastering new disciplines — only the systematic application of proven tools within the context of the modern AI environment.

Read more about Google’s SEO tips for better rankings.

Experiment 3: Test Summary Formats for Machine Readability

As the development of generative systems accelerates, their dependence on concise, clear, and easily understandable summaries grows. Artificial intelligence models heavily rely on the first 150 words of content. If this introduction is unclear, overloaded with descriptive elements, or contains excessive narrative structure, the models may skip the page or misinterpret its content.

This experiment aims to determine which summary format increases content visibility in AI systems and ensures more accurate interpretation when cited.

Three Formats to Test

1. Short Bulleted Summaries

These are optimally suited for:

  • definitions;
  • process descriptions;
  • pricing structures;
  • lists of advantages and disadvantages;
  • comparative characteristics.

Example of a short summary:

  • Price range: $1,500–$5,000.
  • Best for: small businesses with 10–50 employees.
  • Implementation timeline: 2–4 weeks for full deployment.
  • Alternatives: in-house tools, freelance consultants.

2. Concise Paragraph Summaries

These are two- or three-sentence explanations—focused, clear, and informative.

Example:
“The service typically costs between $1,500 and $5,000 depending on business size and the level of customization. Most small companies with 10–50 employees achieve full implementation within 2–4 weeks. Alternatives include in-house tools and freelance consultants, though they usually require more operational oversight.”

3. Narrative Introductions

A traditional search-optimization approach that introduces a topic through a story or descriptive context. Generative systems often skip such introductions, so it is worth testing whether removing them changes how often a page appears in AI Overviews.

Where to Apply the Testing

The formats should be tested on the following page types and materials:

  • guides and tutorials;
  • rankings and listicles (“best of”);
  • service pages;
  • pricing pages;
  • content with extensive Q&A sections;
  • any materials where clarity is critical and where AI systems are highly likely to extract answers from these pages.
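
When splitting a larger set of pages across the three formats, a deterministic assignment keeps each page in the same variant for the full test window. A minimal sketch (the URLs are placeholders):

```python
import hashlib

FORMATS = ["bulleted", "paragraph", "narrative"]

def summary_variant(url: str) -> str:
    """Hash the URL into one of three stable buckets so a page keeps
    the same summary format for the entire 60-day window."""
    bucket = int(hashlib.sha256(url.encode()).hexdigest(), 16) % len(FORMATS)
    return FORMATS[bucket]

for page in ["/guides/topic-a", "/pricing/plan-b"]:  # placeholder URLs
    print(page, "->", summary_variant(page))
```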

Metrics to Track Over 60 Days

  • page appearances in AI Overview depending on the summary format;
  • paraphrasing accuracy: whether the models correctly convey the meaning of the summary;
  • user behavior metrics: scroll depth, time on page, bounce rate;
  • conversions: whether users appreciate this level of clarity as much as AI systems do.

Success Indicators

Testing should reveal which summary format provides:

  • better presence in generative responses,
  • more accurate interpretation of material by AI models,
  • stronger user engagement from audiences who prefer structured and easy-to-understand content.

Example of Application

An e-commerce company tested short bulleted summaries against traditional narrative introductions across 20 product category pages. Pages with bulleted summaries appeared in AI Overview three times more often and demonstrated a 22% higher click-through rate from organic search. This confirmed that clarity matters not only for machines but also for users.

How to Run GEO Testing as a Mini-Program

For most marketing teams, the most effective model is a 60–90 day mini-program. It lets experiments run at small scale, keeps them reversible, and still produces enough data for meaningful insights. Each experiment is treated as a pilot: a limited-scope hypothesis designed to generate new understanding, rather than a large strategic initiative requiring significant resources.

The optimal workflow may look like this:

Weeks 1–2: Establishing the Baseline

  • recording the presence of pages in AI Overview for target queries;
  • collecting LLM responses and evaluating entity-level accuracy;
  • documenting the overall tone of answers and the frequency of competitor mentions;
  • gathering baseline organic metrics (traffic, conversions, engagement).

Weeks 3–6: Execution

  • restructuring the content cluster into an LLM-oriented structure;
  • cleaning and aligning brand-entity signals and business listings;
  • implementing new summary formats;
  • updating schema markup and internal linking patterns.

Weeks 7–12: Measurement

  • comparing AI visibility metrics before and after the experiment;
  • analyzing changes in mentions, citations, or inclusions within model responses;
  • evaluating user behavior metrics to confirm impact;
  • documenting effective and ineffective tactics.
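
If raw answers were logged to a CSV during the baseline phase (as in the Experiment 1 sketch), the before-and-after comparison can start from something as simple as a brand-mention rate; the brand string and cutoff date are placeholders:

```python
import pandas as pd

log = pd.read_csv("ai_visibility_log.csv",
                  names=["date", "query", "answer"])
log["date"] = pd.to_datetime(log["date"])
log["mentions_brand"] = log["answer"].str.contains(
    "Example HVAC", case=False)  # placeholder brand string

# Mention rate before vs. after the changes went live
cutoff = pd.Timestamp("2025-03-01")  # placeholder go-live date
print(log.groupby(log["date"] >= cutoff)["mentions_brand"].mean())
```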

This approach is easily scalable and provides clarity instead of assumptions. Each completed test either confirms the effectiveness of a certain tactic (meaning it should be scaled) or shows insufficient impact (meaning no further investment is needed in that direction).

Mistakes to Avoid: Practical Lessons

Across experiments conducted with different companies, recurring patterns emerge that highlight which actions fail to produce results or introduce unnecessary risks.

Avoid manipulating content to improve AI extraction

Some marketers test hidden text or cloaking targeted at bots. Even if this produces a short-term effect, platforms quickly improve spam detection mechanisms. Similar patterns have been observed in SEO history: early manipulations work until they become widely blocked.

Do not implement several major changes simultaneously

If you restructure a cluster, update business profiles, and change summary formats at the same time, it becomes impossible to understand which factor influenced the outcome. Each variable must be tested independently to produce valid conclusions.

Do not assume AI systems automatically interpret the brand correctly

Models aggregate the information they find in public sources. It is the company’s responsibility to ensure this information is consistent, reliable, and present across all relevant touchpoints.

Align investments with actual impact

Despite the growing role of AI-based search, for most companies it still accounts for only a small share of overall traffic. Therefore, it is crucial to test, measure effectiveness, and invest according to real results—not hypothetical predictions. If experiments generate meaningful impact, the tactic should be scaled; if not, the insights remain valuable, and costs stay controlled.

What GEO-Tests Actually Deliver

The advantage of such experiments is that they create controlled conditions for gaining new knowledge without requiring large traffic volumes. Even while the share of AI search remains small, improvements in content structure, consistency of brand information, and the quality of summary formats positively affect traditional search performance.

Focusing on fundamental principles delivers stable results: clear, well-structured, and useful content performs well regardless of how search technologies evolve.

The true value of these experiments is not in guaranteed traffic gains from AI platforms but in answering essential business questions, such as:

  • whether AI systems interpret the brand correctly;
  • whether structured content improves visibility across channels;
  • whether there are quick wins in cleaning and aligning entity signals;
  • which summary formats are most effective for both AI models and users.

These three experiments form a starting framework accessible to most teams and provide the practical insights needed for informed decision-making. They remain compact, clearly measurable, and deliver cumulative value regardless of how quickly AI search adoption grows.

