I built this website to be aggressively minimal. The original base template was 34 lines of HTML. No JavaScript frameworks. No CSS libraries. Just semantic HTML, a single stylesheet inspired by Motherfucking Website and Better Motherfucking Website, and content served as fast as static files can travel.
Then I looked at my Google Search Console data and realised: my visitors had halved overnight.
The Wake-Up Call
I’d made what I thought were minor changes to the site structure. Reorganised some templates, cleaned up URLs, the usual maintenance. Within 24 hours, my traffic dropped by 50%. Google Search Console showed impressions falling off a cliff. Pages that had ranked on the first page were suddenly nowhere to be found.
What happened? I’d broken the implicit contract with search engines. Changed URLs without redirects. Removed metadata Google was relying on. Restructured content without updating the sitemap. The site still worked perfectly for humans. For crawlers, it was chaos.
This was my wake-up call: minimal doesn’t mean you can ignore the machines.
The Minimalist’s Dilemma
There’s a seductive purity to minimal websites. Every tag serves a purpose. Every line of CSS earns its place. You’re not shipping megabytes of JavaScript to render a blog post. You’re not waiting for hydration. The browser receives HTML and renders it. Done.
But here’s what I didn’t appreciate: the web isn’t just browsers anymore. It’s crawlers, scrapers, AI models, social media preview generators, RSS readers, and a thousand other machines trying to understand what your content means. And they can’t read between the lines.
A human looks at my homepage and understands: this is Gökhan Arkan’s personal site; he works at this place, studies at that place, writes about software engineering and side projects. The context is obvious from the layout, the tone, the content.
Google sees: some text, some links, no structured data, no clear topic hierarchy, no explicit relationships between entities. It indexes the words but misses the meaning.
The Technical Stack for Discoverability
I spent a few hours researching what modern SEO actually requires. Not the spam tactics or keyword stuffing of 2010, but the technical foundations that help search engines understand content.
robots.txt
The first file any crawler looks for. It tells bots what they can and can’t access:
User-agent: *
Allow: /
Sitemap: https://gokhanarkan.com/sitemap.xml
Simple, but essential. Without it, crawlers make assumptions. Some might treat a missing robots.txt as “crawl everything”; others might be more conservative. Being explicit removes ambiguity.
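Because the site is built with Hugo, the file doesn’t have to be maintained by hand. A minimal sketch of one way to generate it, assuming a standard Hugo setup (the enableRobotsTXT switch plus an overridden layouts/robots.txt template):

# hugo.toml: ask Hugo to render a robots.txt at the site root
enableRobotsTXT = true

{{/* layouts/robots.txt: absURL resolves against the configured baseURL */}}
User-agent: *
Allow: /

Sitemap: {{ "sitemap.xml" | absURL }}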
sitemap.xml
The sitemap is your site’s table of contents for search engines. Mine now includes priorities and change frequencies:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://gokhanarkan.com/</loc>
    <lastmod>2025-12-26</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://gokhanarkan.com/blog/</loc>
    <changefreq>weekly</changefreq>
    <priority>0.9</priority>
  </url>
  <!-- Individual posts with priority 0.8 -->
  <!-- Topic pages with priority 0.7 -->
</urlset>
The priority values (0.0 to 1.0) tell search engines which pages matter most. The changefreq hints at how often to re-crawl. Whether Google actually uses these is debated, but they’re part of the standard and cost nothing to include.
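In Hugo, these values don’t have to be typed into the XML by hand; the built-in sitemap template reads them from config and front matter. A sketch, assuming default Hugo behaviour (the numbers are illustrative):

# hugo.toml: site-wide defaults picked up by Hugo's built-in sitemap template
[sitemap]
  changefreq = "weekly"
  priority = 0.8

# content/_index.md front matter (YAML): per-page override for the homepage
sitemap:
  changefreq: weekly
  priority: 1.0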
llms.txt
This is the new frontier. As AI models increasingly crawl the web for training data and retrieval-augmented generation, a new convention is emerging: llms.txt. Like robots.txt, it sits at the site root, but instead of access rules it gives language models a curated, plain-text overview of what the site contains.
# Gökhan Arkan
> Software engineer at GitHub Copilot, postgraduate at Oxford CS.
## Blog Posts
- [UTM Manager](/blog/utm-manager/): Building a lightweight marketing attribution library
- [NestScore](/blog/nestscore-property-evaluation/): Property evaluation tool for London house hunters
- [Software Engineering in 2026](/blog/software-engineering-2026/): My take on AI coding tools
The format is simple: a brief description of the site, followed by a structured list of content with summaries. It helps AI models understand what your site is about without crawling every page. I generate mine automatically from Hugo, excluding pages I don’t want in AI training sets.
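Here’s a rough sketch of how a file like this can be generated at build time with a custom Hugo output format. The format name, the noai front-matter flag, and the field choices are illustrative, not the exact names my setup uses:

# hugo.toml: declare a plain-text output format and attach it to the home page
[outputFormats.llms]
  mediaType   = "text/plain"
  baseName    = "llms"
  isPlainText = true

[outputs]
  home = ["HTML", "RSS", "llms"]

{{/* layouts/index.llms.txt: rendered once per build as /llms.txt */}}
# {{ site.Title }}
> {{ site.Params.description }}

## Blog Posts
{{ range where site.RegularPages "Section" "blog" }}
{{- if not .Params.noai }}
- [{{ .Title }}]({{ .RelPermalink }}): {{ .Description }}
{{- end }}
{{- end }}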
Google Search Console
Search Console is where you see how Google actually perceives your site. After my traffic crash, it showed me:
- Coverage errors: Pages Google couldn’t index
- Mobile usability issues: Problems on smaller screens
- Core Web Vitals: Performance metrics that affect ranking
- Manual actions: Penalties for violating guidelines (thankfully none)
More importantly, it shows what queries bring people to your site. I discovered my most popular page wasn’t what I expected. The data shapes what content to create next.
Structured Data (JSON-LD)
Google’s documentation is clear: structured data helps them “understand the content of the page” and enables “special search result features.” Without it, you’re asking algorithms to infer what humans grasp instantly.
For a personal site with a blog, the relevant schemas are:
- Person: who you are, your social profiles, where you work
- WebSite: the site itself, its name, publisher
- BlogPosting: each article with author, date, word count
- BreadcrumbList: navigation hierarchy
This isn’t optional if you want rich results. Google can display your articles with author info, reading time, and proper attribution. Without the schema, you’re just another blue link.
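For a blog post, the generated JSON-LD ends up looking something like this; the concrete values and the URL slug are illustrative rather than copied from a real page:

<!-- Emitted by partials/head/structured-data.html; values shown are illustrative -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Bending the Knee to the Machines",
  "datePublished": "2025-12-26",
  "wordCount": 1500,
  "author": {
    "@type": "Person",
    "name": "Gökhan Arkan",
    "url": "https://gokhanarkan.com/"
  },
  "mainEntityOfPage": "https://gokhanarkan.com/blog/bending-the-knee-to-the-machines/"
}
</script>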
Open Graph and Twitter Cards
When someone shares your article on LinkedIn or Twitter, what shows up? Without Open Graph tags, the platform guesses. Usually badly. You get a generic title, no image, a truncated description that cuts off mid-sentence.
With proper og:title, og:description, og:image, and Twitter Card meta tags, you control the preview. Your content looks intentional, not accidental.
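The whole set is a handful of meta tags in the head, generated from each page’s front matter. An illustrative snippet (titles, description, and image path are placeholders):

<!-- Open Graph -->
<meta property="og:type" content="article">
<meta property="og:title" content="Bending the Knee to the Machines">
<meta property="og:description" content="What a 50% traffic drop taught me about minimal sites and SEO.">
<meta property="og:url" content="https://gokhanarkan.com/blog/bending-the-knee-to-the-machines/">
<meta property="og:image" content="https://gokhanarkan.com/images/og-default.png">

<!-- Twitter Cards -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Bending the Knee to the Machines">
<meta name="twitter:description" content="What a 50% traffic drop taught me about minimal sites and SEO.">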
The Implementation
I restructured the entire template system. The original 34 lines became a modular partial system:
layouts/
├── partials/
│   ├── head/
│   │   ├── meta.html              # OG, Twitter, description
│   │   ├── resources.html         # CSS, fonts, preload
│   │   └── structured-data.html   # JSON-LD schemas
│   ├── components/
│   │   ├── breadcrumbs.html
│   │   ├── topic-links.html
│   │   ├── related-posts.html
│   │   └── reading-time.html
│   ├── header.html
│   └── footer.html
└── taxonomy/
    ├── topic.html        # Individual topic pages
    └── topic.terms.html  # All topics index
The base template went from 34 lines to 17, but it now pulls in partials that handle all the SEO machinery. The actual HTML output grew significantly, but it’s generated at build time. Users still get fast, minimal pages. The complexity is in the build, not the delivery.
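To give a sense of the shape (a sketch, not the exact file), the slimmed-down base template now reads roughly like this, with the partials from the tree above doing the heavy lifting:

<!-- layouts/_default/baseof.html (sketch) -->
<!DOCTYPE html>
<html lang="{{ site.Language.Lang }}">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>{{ .Title }} | {{ site.Title }}</title>
  {{ partial "head/meta.html" . }}
  {{ partial "head/resources.html" . }}
  {{ partial "head/structured-data.html" . }}
</head>
<body>
  {{ partial "header.html" . }}
  <main>{{ block "main" . }}{{ end }}</main>
  {{ partial "footer.html" . }}
</body>
</html>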
Programmatic SEO
Every topic tag now generates its own page at /blog/topics/{topic}/. These pages aggregate all posts with that tag, provide unique titles and descriptions, and create internal linking opportunities. A post tagged “javascript” automatically links to /blog/topics/javascript/, which lists all JavaScript posts.
This is “programmatic SEO” - generating pages from data rather than writing each one manually. For a personal blog it might seem excessive. But each topic page is a potential search entry point. Someone searching for “javascript side projects” might land on my topic page rather than an individual post.
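A sketch of what the taxonomy/topic.html term template can boil down to; the markup here is illustrative rather than my exact template, but Hugo hands every term page its matching posts as .Pages:

<!-- layouts/taxonomy/topic.html: one page per term, e.g. /blog/topics/javascript/ -->
{{ define "main" }}
<h1>Posts about {{ .Title }}</h1>
<p>{{ len .Pages }} post{{ if ne (len .Pages) 1 }}s{{ end }} tagged "{{ .Title }}".</p>
<ul>
  {{ range .Pages.ByDate.Reverse }}
  <li>
    <a href="{{ .RelPermalink }}">{{ .Title }}</a>
    <small>{{ .Date.Format "2 January 2006" }}</small>
  </li>
  {{ end }}
</ul>
{{ end }}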
Related Posts
Each blog post now shows up to three related posts based on shared topics. This keeps readers on the site longer (which search engines notice) and creates additional internal links (which search engines follow).
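A sketch of how the related-posts.html partial from the tree above can work, assuming Hugo’s related-content engine with the topics taxonomy indexed in config; the weights and markup are illustrative:

# hugo.toml: tell the related-content engine to match on shared topics
[related]
  includeNewer = true
  threshold    = 80
  [[related.indices]]
    name   = "topics"
    weight = 100

{{/* layouts/partials/components/related-posts.html */}}
{{ $related := site.RegularPages.Related . | first 3 }}
{{ with $related }}
<aside>
  <h2>Related posts</h2>
  <ul>
    {{ range . }}
    <li><a href="{{ .RelPermalink }}">{{ .Title }}</a></li>
    {{ end }}
  </ul>
</aside>
{{ end }}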
The Uncomfortable Truth
Here’s what I didn’t want to admit: you can build the most elegant, minimal, fast-loading website in the world, and it doesn’t matter if nobody finds it.
SEO isn’t about gaming algorithms. It’s about speaking the language that machines understand. Structured data is just a formal way of saying what humans already infer from context. Open Graph tags are just explicit instructions for preview generators. Internal linking is just making connections visible.
The minimal philosophy remains intact: every addition serves a purpose, every tag earns its place. But the purpose expanded from “render correctly in a browser” to “be understood by the entire ecosystem of web consumers.”
Bending the Knee
There’s no way around it: if you want people to find your content through search, you play by Google’s rules. You can philosophically disagree with the complexity. You can wish the web was simpler. But pragmatically, you either provide what search engines want or you remain invisible.
The irony isn’t lost on me. I built this site to reject the bloat of modern web development. And here I am, adding JSON-LD schemas, Open Graph tags, and XML sitemaps. The rendered page is still minimal. The source is not.
But I’ve made peace with it. The overhead is invisible to users. It’s generated at build time, not runtime. And it works.
What I Learned
Minimal doesn’t mean invisible. You can maintain a lightweight, fast site while providing the metadata machines need.
Structured data is documentation for robots. Just like code comments help humans understand intent, JSON-LD helps crawlers understand content.
Internal linking is architecture. A website isn’t a folder of documents; it’s a graph of relationships. Make those relationships explicit.
Build complexity, deploy simplicity. All the SEO machinery runs at build time. Users get static HTML. The complexity is in the tooling, not the experience.
Google’s rules aren’t arbitrary. Most SEO best practices are genuinely helpful: clear structure, explicit metadata, accessible markup. Following them makes your site better for everyone, not just crawlers.
I still believe in minimal websites. But I’ve learned that “minimal” means removing what’s unnecessary, not removing what’s invisible. The structured data, the meta tags, the internal links: users never see them, but they’re essential for discoverability.
Sometimes you have to add invisible complexity to preserve visible simplicity. And sometimes, you have to bend the knee.