A multi-site automated publishing system for entity-first SEO, built by Dr. Sina Bari, MD.
Reputation Engine is the system I built to take control of my professional online presence. It coordinates AI-powered content generation, multi-site publishing, structured data optimization, and SERP monitoring across four owned domains - all orchestrated by autonomous agents running on n8n.
📖 Read the full article: How I Built a Personal Reputation Engine with AI Agents
- Narrative-led editorial voice for drsinabari.com (Jun 13) -- Long-form essays now write in the Malcolm Gladwell tradition: scene-first openings, withheld thesis, counterintuitive reframes, one coined named concept per essay (2-4 words) that recurs, and stacked cases that converge on a single principle. AEO answer-block suppressed on this site only; extraction value moves to the coined concept and the end-of-essay FAQ. Other three sites unchanged.
- Pre-push scan hardened (Jun 13) -- Adds OpenAI/HuggingFace/AWS/Google API key patterns, generic JWT detection, an `X-Voice-Key` literal check, and a generalized high-entropy base64-ish blob scanner with a `# SAFE-B64` allowlist for known-safe content hashes.
- measure.py SSL hardening (Jun 13) -- Removed insecure `CERT_NONE` SSL context; cert verification now enforced on the BrightData / GSC measurement script.
- sync_pending_actions label updates (Jun 13) -- Daily Todo reconciler detects when a tracked todo's label text has drifted against the same `todo_id` and updates the line in place rather than leaving stale wording.
See CHANGELOG.md for the full changelog.
Physicians and professionals often discover that Google results for their name include outdated, inaccurate, or context-free information. You can't remove those results, but you can build enough high-quality, authoritative content to occupy the visible SERP yourself.
That's what Reputation Engine does - systematically.
Instead of a single website competing for one slot, I run four purpose-built domains, each targeting a different facet of my professional identity. An autonomous agent pipeline researches topics, generates content, validates SEO quality, publishes articles, and measures the impact - on a weekly schedule, with human oversight at every stage.
┌─────────────────────────────────────────────────────────┐
│                 Portfolio Orchestrator                  │
│      (scheduling, cadence, dispatch, auto-publish)      │
└──────────┬──────────┬──────────┬──────────┬─────────────┘
           │          │          │          │
     ┌─────▼───┐  ┌───▼────┐  ┌──▼───┐  ┌───▼────────┐
     │Research │  │Content │  │  QA  │  │ Publisher  │
     │  Agent  │  │ Agent  │  │Agent │  │   Agent    │
     └─────────┘  └────────┘  └──────┘  └────────────┘
           │          │          │          │
     ┌─────▼───┐  ┌───▼────┐  ┌──▼───┐  ┌───▼────────┐
     │   SEO   │  │ Media  │  │Meas. │  │ Technical  │
     │Research │  │ Ingest │  │Agent │  │ SEO Agent  │
     └─────────┘  └────────┘  └──────┘  └────────────┘
| Domain | Role | Content Focus |
|---|---|---|
| sinabarimd.com | Canonical identity hub | Bio, work, media, selected writing |
| sinabari.net | Healthcare AI authority | Healthcare AI analysis, health tech, digital health |
| drsinabari.com | Editorial node | Medicine & technology essays, clinical ethics, healthcare policy |
| sinabariplasticsurgery.com | Specialty node | Aesthetics, aging, rejuvenation, surgery |
Every agent is a standalone n8n workflow with a single responsibility:
- Portfolio Orchestrator - The scheduling brain. Runs per-site cron jobs, checks publishing cadence, auto-publishes approved drafts, and dispatches content generation when the queue is empty.
- Content Research Agent - Runs weekly topic scouting (Phase 1) using web search APIs, then deep research (Phase 2) when an operator selects a topic. Supports file attachments (PDFs, papers) for research context.
- Content Generator - Takes a research brief and site profile, generates a structured draft via LLM, and stores it for human review. Enforces per-site word counts, tone, and forbidden topics.
- Content Publisher - A 20-node pipeline that fetches the approved draft, renders the article page with full SEO metadata and structured data, updates the homepage, generates sitemaps, deploys via a deterministic file-sync service, and triggers QA.
- SEO QA Agent - Three-level validation (article, domain, portfolio). Checks structured data, meta tags, internal linking, content quality. Runs automatically after every publish.
- SEO Research Agent - Weekly intelligence brief analyzing SERP trends, competitor movements, and keyword opportunities across all four domains.
- Technical SEO Implementer - Converts SEO research briefs into actionable tasks with an approve/dismiss/execute workflow.
- Media Ingestion Agent - Monitors the web for mentions of my name, classifies them, and queues relevant items for the press page.
- Measurement Agent - Tracks SERP positions using residential proxy searches and Google Search Console data. Monitors for negative results and generates alerts.
- Site Refresh - Operator-triggered full page regeneration for design updates (used carefully - it's a destructive operation).
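The Portfolio Orchestrator's cadence check can be pictured roughly as follows. This is a minimal Python sketch, not the n8n workflow itself: the function name `should_publish` and its arguments are hypothetical, while `cron_days` and `min_days_between_publishes` mirror the fields in the site profiles.

```python
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def should_publish(today: date, last_published: Optional[date],
                   cron_days: list[str], min_days_between: int,
                   has_approved_draft: bool) -> bool:
    """Return True only if every cadence rule allows publishing today."""
    # Nothing goes live without an explicitly approved draft.
    if not has_approved_draft:
        return False
    # date.weekday() returns 0 for Monday, matching the WEEKDAYS order.
    if WEEKDAYS[today.weekday()] not in cron_days:
        return False
    # A site that has never published has no minimum gap to respect.
    if last_published is None:
        return True
    return today - last_published >= timedelta(days=min_days_between)
```

With `cron_days=["tuesday", "friday"]` and `min_days_between=3`, a Tuesday run publishes only if the previous article is at least three days old and an approved draft is waiting.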
| Component | Technology | Why |
|---|---|---|
| Orchestration | n8n (self-hosted) | Visual workflow builder, webhook-native, good API |
| Content Generation | OpenClaw (self-hosted LLM gateway) | Full control over prompts, model swapping, no vendor lock-in, all data stays on-prem |
| Web Research | Tavily API | Purpose-built for AI research, good relevance |
| SERP Monitoring | BrightData residential SERP API | Accurate residential-IP search results |
| Search Analytics | Google Search Console API | First-party click/impression data |
| Hosting | Static HTML + nginx + Traefik | Fast, simple, deterministic deploys |
| Deploy | Custom Python deploy service (port 9911) | Full-file-sync model, atomic deploys |
| Text Extraction | Custom Python service (port 9913) | PDF/DOCX/TXT → plain text for research attachments |
| Site Design | Google Stitch via MCP | AI-generated site designs, connected through Model Context Protocol |
| Development - Design | Claude Cowork | Architecture planning, spec writing, brainstorming |
| Development - Build | Claude Code | Live API calls, coding, deployment, debugging |
| Infrastructure | Single VPS + Docker | n8n in container, host services via Docker bridge |
n8n runs inside Docker and can't execute host commands directly. The solution: lightweight Python HTTP services on the host, managed by systemd, firewalled to only accept connections from the Docker bridge subnet. Each is a single Python file using http.server. When n8n needs host-level capabilities (file deploy, text extraction, etc.), it makes an HTTP call to host.docker.internal:{port}.
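A host helper of this kind might look something like the sketch below. It is an illustration under stated assumptions, not the project's actual service: the handler name, the placeholder port 9999, the echo-style response, and the in-process subnet check are all hypothetical, and `172.17.0.0/16` is only Docker's default bridge subnet, which may differ on your host. The firewall remains the primary control; the in-code check is defense in depth.

```python
import ipaddress
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: Docker's default bridge subnet; adjust to your own network.
ALLOWED_SUBNET = ipaddress.ip_network("172.17.0.0/16")

def is_allowed_client(ip: str) -> bool:
    """True if the caller's address falls inside the Docker bridge subnet."""
    return ipaddress.ip_address(ip) in ALLOWED_SUBNET

class HostServiceHandler(BaseHTTPRequestHandler):
    def _reply(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_POST(self) -> None:
        # Defense in depth: the firewall should already block other callers.
        if not is_allowed_client(self.client_address[0]):
            self._reply(403, {"error": "forbidden"})
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # Hypothetical host-level action: report which keys were received.
        self._reply(200, {"received_keys": sorted(payload)})

def serve(port: int = 9999) -> None:
    """Blocking entry point; 9999 is a placeholder, not a port the project uses."""
    HTTPServer(("0.0.0.0", port), HostServiceHandler).serve_forever()
```

Keeping each service to one file with only standard-library imports means systemd can run it directly with the system Python and no virtual environment.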
This entire system was designed in Claude Cowork and built with Claude Code. Cowork handles the thinking - architecture, specs, strategy, brainstorming. Claude Code handles the doing - live n8n API calls, writing code, deploying changes, debugging production issues. They share the same project folder, so a spec file drafted in Cowork is immediately available for Code to implement.
A 500-line CLAUDE.md file in the project root acts as institutional memory - complete API reference, workflow IDs, webhook endpoints, architectural rules, and deployment procedures. Every Claude Code session reads it automatically, starting with full system context.
The system uses a form of Reinforcement Learning from AI Feedback (RLAIF) to improve published content quality over time. It works in two layers:
Layer 1 (Deterministic) runs a rules-based check on every published article, scanning for known AI content tells: banned generic phrases, em-dashes, missing first-person clinical voice, weak specificity signals, insufficient outbound authority links, and structural tells like hedge openers. This runs as a Code node in the SEO QA Agent, costs nothing, and executes in milliseconds.
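To make the Layer 1 idea concrete, here is a minimal Python sketch of a rules-based pass. The real check runs as an n8n Code node, so this is a translation of the concept rather than the project's code; the phrase lists, the link threshold, and the function name `layer1_check` are hypothetical placeholders.

```python
import re

# Hypothetical rule values; the project's real lists live in its QA code.
BANNED_PHRASES = ["in today's fast-paced world", "it is important to note"]
HEDGE_OPENERS = ("arguably", "perhaps", "in many ways")
MIN_OUTBOUND_LINKS = 2

def layer1_check(text: str, own_domain: str) -> list[str]:
    """Return the names of the rules the article fails (empty list = clean)."""
    failures = []
    lowered = text.lower()
    if any(phrase in lowered for phrase in BANNED_PHRASES):
        failures.append("banned_phrase")
    if "\u2014" in text:  # U+2014 is the em-dash character
        failures.append("em_dash")
    if not re.search(r"\b(I|my|we|our)\b", text, flags=re.IGNORECASE):
        failures.append("no_first_person")
    # Capture the host part of every URL, then drop links to the site itself.
    hosts = re.findall(r"https?://([^/\s\"'<>)]+)", text)
    outbound = [h for h in hosts if not h.lower().endswith(own_domain)]
    if len(outbound) < MIN_OUTBOUND_LINKS:
        failures.append("few_outbound_links")
    if lowered.lstrip().startswith(HEDGE_OPENERS):
        failures.append("hedge_opener")
    return failures
```

Because every rule is a string or regex test, the whole pass is deterministic and cheap, which is what lets it run on every publish.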
Layer 2 (Model-Based) sends each article through a 5-dimension editorial rubric via three independent LLM passes, then aggregates scores with confidence tracking. The dimensions - first_hand_expertise, information_gain, specificity_evidence, depth_substance, voice_authenticity - evaluate the holistic signals that deterministic checks can't measure. This is advisory only and never gates a deploy.
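One plausible way to aggregate the three passes is sketched below. The dimension names come from the rubric above; using the mean as the score and the population standard deviation as a disagreement signal is an assumption for illustration, since the exact aggregation and confidence formula are not specified here.

```python
from statistics import mean, pstdev

DIMENSIONS = ["first_hand_expertise", "information_gain", "specificity_evidence",
              "depth_substance", "voice_authenticity"]

def aggregate_passes(passes: list[dict[str, float]]) -> dict[str, dict[str, float]]:
    """Average each dimension across passes; a small spread means the passes agree."""
    result = {}
    for dim in DIMENSIONS:
        scores = [p[dim] for p in passes]
        result[dim] = {
            "score": round(mean(scores), 2),
            "spread": round(pstdev(scores), 2),  # 0.0 when all passes match
        }
    return result
```

A dimension with a high spread is one where the graders disagree, so its score deserves less weight when deciding what to rewrite.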
The feedback loop: articles are published, automatically graded by both layers, and the results surface in the operator dashboard with per-dimension scores and suggested fixes. The operator rewrites weak articles using the grading feedback, redeploys, and regrades. Each cycle produces measurable score deltas that identify which editorial tactics have the highest impact per dimension.
After two rewrite cycles across 12 articles, the system produced a ranked playbook of editorial interventions. The highest-impact tactics: named citations with quantitative findings (+1.3 to +2.4 on specificity), opening clinical anecdotes with patient-specific detail (+2.0 on expertise), and quoted patient dialogue (+1.0 to +1.4 on expertise/voice). These findings feed back into the Content Generator's prompt engineering, closing the loop between evaluation and generation.
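Ranking tactics by score delta could be sketched as follows. This assumes each rewrite record names a single tactic and carries before/after scores per dimension; the record shape and the function name `rank_tactics` are hypothetical, and attributing a delta to one tactic is only valid when a rewrite changed one thing at a time.

```python
from collections import defaultdict
from statistics import mean

def rank_tactics(rewrites: list[dict]) -> list[tuple[str, str, float]]:
    """Rank (tactic, dimension) pairs by mean score change across rewrites."""
    deltas = defaultdict(list)
    for rewrite in rewrites:
        for dim, after in rewrite["after"].items():
            deltas[(rewrite["tactic"], dim)].append(after - rewrite["before"][dim])
    ranked = [(tactic, dim, round(mean(values), 2))
              for (tactic, dim), values in deltas.items()]
    # Largest average improvement first.
    return sorted(ranked, key=lambda row: row[2], reverse=True)
```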
The approach treats content quality as an empirical optimization problem rather than a subjective editorial judgment. Every rewrite is an experiment with a measurable outcome.
Example: Scrolling News Ticker. The sinabarimd.com homepage has a scrolling news ticker showing recent media mentions. It went from idea → design spec (Cowork) → working component deployed to production (Claude Code) in a single session. That's the kind of iteration speed this workflow enables for a non-engineer.
The deploy service does a full file sync - every deploy lists exactly which files should exist, and anything not in the list is removed. This makes deploys completely deterministic: you always know exactly what's live. No database, no plugins, and very little attack surface on the served sites. The tradeoff is that you need a rendering pipeline, which the Content Publisher handles.
Each domain builds its own authority and competes for its own SERP slot. Subdomains of a single domain would consolidate ranking power but only occupy one result. The goal is to own as many page-one results as possible for branded queries.
Every draft goes through human review before publishing. The operator can edit titles, excerpts, full content, and even reroute articles to a different site. Auto-publish only fires for drafts that have been explicitly approved. This is a reputation system - accuracy matters more than speed.
Each agent has exactly one job. The Content Generator doesn't know about SEO scores. The QA Agent doesn't generate content. The Measurement Agent doesn't publish anything. This makes the system debuggable, testable, and safe to modify - changing one agent never breaks another.
reputation-engine/
├── README.md  # You are here
├── LICENSE  # MIT
├── workflows/  # All 10 n8n workflow JSONs (sanitized)
│   ├── portfolio-orchestrator.json  # Scheduling brain (54 nodes)
│   ├── content-research-agent.json  # Topic scout + deep research (43 nodes)
│   ├── content-generator.json  # Draft generation via LLM (30 nodes)
│   ├── content-publisher.json  # 20-node article pipeline
│   ├── seo-qa-agent.json  # 3-level SEO validation (28 nodes)
│   ├── seo-research-agent.json  # Weekly intelligence brief (14 nodes)
│   ├── technical-seo-implementer.json  # Brief-to-tasks pipeline (22 nodes)
│   ├── media-ingestion-agent.json  # Media monitoring (18 nodes)
│   ├── measurement-agent.json  # SERP + GSC tracking (28 nodes)
│   └── site-refresh.json  # Full page regen (35 nodes)
├── dashboard.html  # Operator dashboard (3,350 lines, 8 tabs)
├── deploy/
│   └── deploy_service.py  # Deterministic file-sync deploy service
├── services/
│   └── deep-researcher-api.py  # Async academic paper research + n8n callback
├── scripts/
│   └── backup.sh  # Full system backup (workflows + sites + state)
├── profiles/
│   ├── sinabarimd_com.yaml  # Site profile - canonical hub
│   ├── sinabari_net.yaml  # Site profile - healthcare AI
│   ├── drsinabari_com.yaml  # Site profile - editorial
│   └── sinabariplasticsurgery_com.yaml  # Site profile - specialty
├── qa/
│   └── qa_checks.js  # SEO QA validation logic (n8n Code node)
├── schema/
│   ├── homepage_person.json  # Person+Physician structured data
│   ├── article_schema.json  # Article page structured data template
│   └── faq_extractor.js  # Auto-extracts FAQPage schema from HTML
├── templates/
│   └── article_meta.html  # Article page SEO meta template
├── measurement/
│   └── measure.py  # GSC data collection via service account
└── docs/
    ├── architecture.md  # Detailed architecture documentation
    └── publishing-pipeline.md  # The 20-node publish flow
Each domain has a YAML profile that controls content generation, publishing cadence, and SEO settings:
site_id: sinabari_net
domain: sinabari.net
name: "Sina Bari, MD - Healthcare AI Analysis"
role: "Healthcare AI authority site"
author:
  name: "Dr. Sina Bari, MD"
  url: "https://sinabarimd.com/about"
content:
  allowed_topics:
    - healthcare AI
    - medical technology
    - digital health
    - precision medicine
  forbidden_topics:
    - plastic surgery
    - reconstructive surgery
    - generic AI
  default_word_count: 1200
  tone: "analytical, evidence-based, first-person clinical perspective"
publishing:
  min_days_between_publishes: 3
  pipeline_section: "ANALYSIS"
  cron_days: [tuesday, friday]
seo:
  schema_type: "WebSite"
  canonical_hub_link: true
  author_id: "https://sinabarimd.com/#sinabari"

The deploy service is a simple Python HTTP server that receives a file manifest and atomically syncs a site directory:
# Simplified - see deploy/deploy_service.py for the full implementation
import os

def handle_deploy(request):
    # `request.json` is assumed to hold the already-parsed JSON body
    payload = request.json
    domain = payload['domain']  # site being deployed (unused in this simplified version)
    deploy_path = payload['deployPath']
    files = payload['files']

    # Write all files from the manifest
    for file_entry in files:
        path = os.path.join(deploy_path, file_entry['path'])
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'w', encoding='utf-8') as f:
            f.write(file_entry['content'])

    # Remove any files NOT in the manifest (full sync)
    manifest_paths = {os.path.normpath(f['path']) for f in files}
    for root, _dirs, names in os.walk(deploy_path):
        for name in names:
            full_path = os.path.join(root, name)
            # Compare paths relative to the site root, as the manifest lists them
            if os.path.relpath(full_path, deploy_path) not in manifest_paths:
                os.remove(full_path)

    return {"success": True, "files_written": len(files)}

After 5 weeks of operation (as of late April 2026):
- 4 owned domains ranking on page 1 for branded queries
- 10+ articles published across all sites with automated QA
- Structured data (Person, Physician, Article, FAQPage) deployed on every page
- Web 2.0 syndication across 15+ platforms with no-repeat tracking
- Zero manual deploys -- everything goes through the pipeline
- Deep academic research -- papers indexed and synthesized into content briefs
- Operator dashboard -- single-page control plane with 8 tabs, daily todos, inline actions
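As an illustration of the structured data mentioned in the list above, an Article block in JSON-LD could be rendered along these lines. This is a sketch using standard schema.org properties, not the contents of `schema/article_schema.json`; the function name and argument set are hypothetical, and `author_id` corresponds to the profile field of the same name.

```python
import json

def article_jsonld(headline: str, url: str, date_published: str,
                   author_id: str, author_name: str) -> str:
    """Render a schema.org Article as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,
        "mainEntityOfPage": url,
        # Pointing every article at one @id ties authorship to a single entity.
        "author": {"@type": "Person", "@id": author_id, "name": author_name},
    }
    # Escape "</" so a headline can never close the script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

Reusing the same author `@id` across all four domains is what lets search engines resolve the separate sites to one person.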
This repository is a reference implementation. To adapt it for your own use:
- Set up n8n - self-hosted instance with API access
- Define your domains - what facets of your identity do you want to represent?
- Create site profiles - YAML configs that control content and publishing rules
- Set up the deploy service - or adapt to your hosting (Netlify, Vercel, S3, etc.)
- Connect an LLM - OpenClaw, OpenAI, Anthropic, or any compatible API
- Build incrementally - start with one site, add agents as you go
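When adapting the site profiles, it can help to check drafts against them mechanically. The sketch below assumes the profile has already been loaded into a plain dict with the same keys as the YAML example; `validate_draft` and the 20% word-count tolerance are hypothetical, and plain substring matching is a deliberately crude stand-in for real topic detection.

```python
def validate_draft(draft_text: str, profile: dict, tolerance: float = 0.2) -> list[str]:
    """Return human-readable problems; an empty list means the draft passes."""
    problems = []
    content = profile["content"]
    lowered = draft_text.lower()
    # Flag any forbidden topic that appears verbatim in the draft.
    for topic in content["forbidden_topics"]:
        if topic.lower() in lowered:
            problems.append(f"mentions forbidden topic: {topic}")
    # Allow the word count to drift within a tolerance band around the target.
    target = content["default_word_count"]
    words = len(draft_text.split())
    if abs(words - target) > tolerance * target:
        problems.append(f"word count {words} outside {target} +/- {tolerance:.0%}")
    return problems
```

Starting with one site and one check like this keeps the first iteration small; further rules can be added as each new agent comes online.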
Dr. Sina Bari, MD
Physician · Healthcare AI · Medical Technology
- 🌐 sinabarimd.com
- 🏥 sinabari.net - Healthcare AI Analysis
- ✍️ drsinabari.com - Essays on Medicine & Technology
MIT - see LICENSE for details.
This is a reference implementation of the system described in How I Built a Personal Reputation Engine with AI Agents.