Clobsidian in Detail: Cross-Source Personal Infrastructure

08 Jan, 2026 · 7 min read
  • LLMs
  • PKM

I’ve spent the past few weeks building what I’m calling “Clobsidian” — a personal knowledge infrastructure that connects my Obsidian vaults, browser sessions, email, journal, and task manager into something Claude Code can query and synthesize.

There are 86 AI plugins for Obsidian now. Most do some variant of “chat with your vault” — RAG over your notes, semantic search, AI-assisted editing. That’s useful, but it treats your vault as the only data source. I wanted something different: cross-source synthesis. What patterns emerge when you combine what I wrote in my journal with what I actually did in my browser, what I said I’d do in emails, and what tasks I completed (or didn’t)?

The other common approach is giving Claude an OAuth token to Gmail, Google Calendar, etc. I don’t love that either. I’d rather Claude have read-only access to local copies of my data — no credentials that could delete my email or post on my behalf, no API tokens to exfiltrate.

So I built something in between: local data, multiple sources, synthesis across them. Here’s what I built and what I learned.

The Architecture

The system has three main pieces:

~/Documents/PARA-Vault/     # My Obsidian vault (read-only for Claude)
~/Documents/Claude-Vault/   # Claude's workspace (read-write)
~/data/                     # SQLite databases, email index

The PARA-Vault follows Tiago Forte’s PARA method — Projects, Areas, Resources, Archive. Claude can read everything here but writes nothing. All of Claude’s output goes to Claude-Vault, which is also an Obsidian vault so I can browse it with the same tools.

The ~/data/ directory holds synchronized copies of external services:

~/data/
├── email/                    # Maildir format, ~40 folders
│   ├── INBOX/
│   ├── Sent/
│   ├── Work/
│   ├── Personal/
│   └── .notmuch/             # Xapian search index
├── rosebud/
│   ├── rosebud-2026-01-01.md # Raw exports from app
│   └── rosebud.db            # SQLite with parsed entries
└── todo/
    ├── todo.db               # Tasks from Microsoft Graph API
    └── ms_auth_record.json   # OAuth token (not read by Claude)

Data Source Integration

Getting data into a format Claude can query was the first challenge. Here’s how each source works:

Email (mbsync + notmuch)

I sync Gmail locally with mbsync, then index with notmuch. This is more work than giving Claude an OAuth token to the Gmail API, but there’s a security advantage: Claude gets read-only access to a local copy.

The mbsync process runs separately (via cron or manual trigger), pulling emails into Maildir format. Claude Code can read and search these files but has no credentials to modify anything on Google’s servers. No “accidentally delete all my email” failure mode. No OAuth token that could be exfiltrated. Just plain text files on disk.

Notmuch provides fast full-text search over the Maildir. I filter out automated emails — newsletters, notifications, GitHub alerts — to focus on human correspondence. This gives me sender patterns, thread counts, and the ability to detect commitment phrases (“I’ll send you…”, “I will… by…”) for accountability tracking.
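
Here’s roughly what that filtering and detection looks like in Python; the sender patterns and commitment regexes below are illustrative, not the exact ones I use:

import json
import re
import subprocess

# Crude patterns for demonstration; the real lists are longer
AUTOMATED_SENDERS = re.compile(r"no-?reply|notifications@|newsletter", re.I)
COMMITMENT_PHRASES = re.compile(r"\bI(?:'ll| will)\b.*\b(send|get back|follow up)\b", re.I)

def search_threads(query: str) -> list[dict]:
    """Run a notmuch search and return thread summaries as dicts."""
    out = subprocess.run(
        ["notmuch", "search", "--format=json", query],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)

# Human correspondence from the last week (the date range is computed elsewhere)
threads = search_threads("date:2026-01-01..2026-01-08 not tag:spam")
human = [t for t in threads if not AUTOMATED_SENDERS.search(t["authors"])]

# Flag threads whose subject hints at a commitment; the real skill scans bodies too
commitments = [t for t in human if COMMITMENT_PHRASES.search(t["subject"])]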

Rosebud Journal (Markdown → SQLite)

Rosebud is an AI journaling app. Their export is markdown, which is fine for reading but slow for temporal queries like “what did I write about last Tuesday?” I built an ingest script that parses entries into SQLite with tables for entries (by date and speaker) and person mentions.

The person_mentions table is interesting — I extract names from journal entries using regex, then validate them with a local LLM (Ollama running gemma3:27b-it-qat). This filters out false positives like “Party”, “Empty”, “Let”, “Obsidian” — words that look like names but aren’t. The LLM validation uses a simple prompt: “Is ‘X’ a person’s name? Answer yes or no.” Results are cached, so after the first pass it’s instant.
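
The validation-plus-cache loop is small enough to show in full. A sketch against Ollama’s HTTP API (the cache location is hypothetical):

import json
import requests
from pathlib import Path

# Hypothetical cache file; any persistent key-value store would do
CACHE = Path("~/data/rosebud/name_cache.json").expanduser()
cache = json.loads(CACHE.read_text()) if CACHE.exists() else {}

def is_person_name(candidate: str) -> bool:
    """Ask the local model whether a regex hit is really a name; cache the verdict."""
    if candidate in cache:
        return cache[candidate]
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "gemma3:27b-it-qat",
            "prompt": f"Is '{candidate}' a person's name? Answer yes or no.",
            "stream": False,
        },
        timeout=120,
    )
    verdict = resp.json()["response"].strip().lower().startswith("yes")
    cache[candidate] = verdict
    CACHE.write_text(json.dumps(cache, indent=2))
    return verdict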

Vivaldi Browser Sessions

I use Vivaldi with workspaces — each workspace roughly corresponds to a PARA project or area. The session files are a binary/JSON hybrid that took some reverse-engineering to parse. Once decoded, I can answer questions like:

  • Which workspaces have I not touched in 2+ weeks?
  • Which have 30+ tabs (probably need triage)?
  • Are there workspaces with no corresponding PARA structure?
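
Once the parsed sessions are in SQLite, those questions become a single query. The table layout below (one row per open tab, with a workspace name and last-visited timestamp) is illustrative, not the actual schema:

import sqlite3
import time

con = sqlite3.connect("browser_sessions.db")  # hypothetical path
cutoff = time.time() - 14 * 86400  # two weeks ago, unix seconds

stale_or_bloated = con.execute(
    """
    SELECT workspace, COUNT(*) AS tabs, MAX(last_visited) AS most_recent
    FROM tabs
    GROUP BY workspace
    HAVING MAX(last_visited) < ? OR COUNT(*) >= 30
    ORDER BY most_recent
    """,
    (cutoff,),
).fetchall()

for workspace, tabs, most_recent in stale_or_bloated:
    print(f"{workspace}: {tabs} tabs, last active {time.ctime(most_recent)}")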

Microsoft To-Do (Graph API → SQLite)

Microsoft’s Graph API is well-documented, but OAuth is always a hassle. Once authenticated, I pull tasks into SQLite with fields for title, due date, completion status, and list membership. The main use is cross-referencing: do my stated goals have corresponding tasks? Are there active tasks for things I said I’d do?
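
The sync step itself is unremarkable once a token exists. A trimmed-down sketch, assuming authentication has already happened elsewhere (pagination and error handling omitted):

import sqlite3
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def pull_tasks(token: str, db_path: str = "todo.db") -> None:
    """Mirror Microsoft To-Do lists and tasks into a local SQLite table."""
    headers = {"Authorization": f"Bearer {token}"}
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS tasks "
        "(id TEXT PRIMARY KEY, list_name TEXT, title TEXT, status TEXT, due TEXT)"
    )
    lists = requests.get(f"{GRAPH}/me/todo/lists", headers=headers).json()["value"]
    for lst in lists:
        tasks = requests.get(
            f"{GRAPH}/me/todo/lists/{lst['id']}/tasks", headers=headers
        ).json()["value"]
        for t in tasks:
            due = (t.get("dueDateTime") or {}).get("dateTime")
            con.execute(
                "INSERT OR REPLACE INTO tasks VALUES (?, ?, ?, ?, ?)",
                (t["id"], lst["displayName"], t["title"], t["status"], due),
            )
    con.commit()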

Clay.earth (MCP Server)

Clay is a personal CRM. They provide an MCP server, which means Claude Code can query it directly without me writing any integration code:

mcp__clay__searchContacts(query="people I met at conference", limit=10)
mcp__clay__getContact(contact_id=12345)

I use this primarily for enriching person mentions — when my journal mentions “Jan”, I can pull his job title, company, and last interaction from Clay.

The Skills

With data sources in place, I built “skills” — reusable workflows that Claude Code can execute. Each skill is a Python script plus a markdown file describing when and how to use it. I invoke them via Claude Code’s chat interface (“run the weekly reflection”) or directly from the command line. Here are the two I use most:

Weekly Reflection

This is the one I’m proudest of. It takes raw text from all sources — journal entries, task titles, email subjects, browser tab names, note snippets — and synthesizes it into a narrative reflection using Claude’s API.

The output looks like this:

Weekly Reflection — 2026-01-01

“This was a week of shipping under pressure. You pushed the evals.cz site live, kept the Czechitas curriculum on track, and started mulling over the AI Council CfP—but the pattern of ‘just one more thing’ before bed suggests the pace isn’t sustainable.”

The numbers say: 1,204 browser visits, 130 emails, 89% task completion. But what were you actually doing?

Themes of the Week: Teaching and side projects competing for attention

The AI Bootcamp’s “Small-Project Evals” module took priority—slides, exercises, example notebooks. But your browser tells a different story: 18 tabs about the AI Council conference in Oakland, their CfP page bookmarked but no draft started. You’re deciding whether to submit, which means you’re not actually writing.

Tensions: Shipping vs. polish

You deployed evals.cz on Tuesday with “good enough” copy, then spent Wednesday-Friday tweaking it. Your journal notes frustration: “I keep finding things to fix instead of promoting it.” The site is live but you haven’t told anyone. Launch without promotion isn’t really a launch.

Reflections for You

  1. What would “done” look like for evals.cz—not perfect, but actually shipped and shared?
  2. The AI Council CfP is open. Are you submitting or not? Decide by Friday so you can stop researching and start writing (or stop thinking about it).

The key insight: statistics tell you that you did things, but not what you were actually wrestling with. The reflection synthesizes across sources to surface themes, tensions, and questions I wouldn’t have noticed otherwise.

The implementation uses pydantic-ai for structured output. This is the part I’m most pleased with technically — Pydantic models define exactly what the reflection should contain, and the agent returns typed objects:

from typing import List
from pydantic import BaseModel, Field
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

class Theme(BaseModel):
    """A theme that emerged from the week's data."""
    title: str = Field(description="Short title (3-7 words)")
    description: str = Field(description="Narrative with specific evidence from sources")
    sources: List[str] = Field(description="Which data sources support this theme")

class Tension(BaseModel):
    """A gap between intention and action."""
    description: str = Field(description="What the tension is")
    intention_source: str = Field(description="Where you stated this intention")
    reality_evidence: str = Field(description="What actually happened")

class WeeklyReflection(BaseModel):
    """Structured weekly reflection output."""
    opening_observation: str = Field(description="2-3 sentence synthesis")
    themes: List[Theme] = Field(description="2-4 themes with cross-source evidence")
    tensions: List[Tension] = Field(description="Intention vs reality gaps")
    reflection_questions: List[str] = Field(description="3-5 questions for the user")

# Create the agent with Claude Opus
model = AnthropicModel('claude-opus-4-5-20251101')
agent = Agent(
    model=model,
    output_type=WeeklyReflection,
    system_prompt="""You are analyzing a week of someone's life across multiple
    data sources: journal entries, tasks, emails, browser tabs, and notes.
    Synthesize patterns and surface what they might not notice themselves."""
)

# Run with all the extracted context
result = agent.run_sync(context_from_all_sources)
reflection: WeeklyReflection = result.output  # Fully typed!

The structured output ensures every reflection has themes, tensions, and questions — the sections I actually want. No more “the LLM forgot to include the questions this time.”

Intention ↔ Reality Gap

This one answers the question: am I actually making progress on the things I say I want?

It works by:

  1. Parsing my yearly goals file (2025 Goals.md) for incomplete items
  2. Extracting stated intentions from journal entries (“I want to…”, “I should…”, “I need to…”)
  3. Searching for evidence across PARA vault, email, tasks, and browser activity
  4. Flagging goals with no recent activity

The output:

Intention ↔ Reality Gap Analysis

Goals file: 2025 Goals.md · Year: 2025 · Goals checked: 5 · With evidence: 4 · Without evidence: 1

Stale/Neglected Goals (1) · Last activity > 90 days

Q2: Publish benchmark results to HuggingFace

  • Last activity: 142 days ago (4 months)
  • Keywords: benchmark, HuggingFace, evaluation, publish
  • Evidence: 1 project, 3 notes, 47 related emails

Recent Intentions from Journal — 4 in last 30 days:

  • [2025-12-18] reach out to three potential beta users for feedback
  • [2025-12-15] finish the evals.cz landing page copy
  • [2025-12-12] decide whether to submit to AI Council CfP
  • [2025-12-08] prep the Small-Project Evals module for AI Bootcamp

The keyword extraction uses Ollama to pull meaningful search terms from goal text, then searches each data source. A goal with “evidence” means I found related PARA notes, active tasks, or email activity. A goal without evidence is one I stated but never acted on.
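
In sketch form, with illustrative prompt wording and helper names (the real skill also searches email and browser activity):

import sqlite3
import requests
from pathlib import Path

def extract_keywords(goal: str) -> list[str]:
    """Ask the local model for search terms that represent a goal."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "gemma3:27b-it-qat",
            "prompt": f"List 3-5 search keywords for this goal, comma-separated: {goal}",
            "stream": False,
        },
        timeout=120,
    )
    return [k.strip() for k in resp.json()["response"].split(",") if k.strip()]

def evidence_for(keywords: list[str], vault: Path, todo_db: str) -> dict:
    """Count vault notes and tasks that mention any of the keywords."""
    if not keywords:
        return {"notes": 0, "tasks": 0}
    notes = [
        p for p in vault.rglob("*.md")
        if any(k.lower() in p.read_text(errors="ignore").lower() for k in keywords)
    ]
    con = sqlite3.connect(todo_db)
    clause = " OR ".join("title LIKE ?" for _ in keywords)
    tasks = con.execute(
        f"SELECT COUNT(*) FROM tasks WHERE {clause}",
        [f"%{k}%" for k in keywords],
    ).fetchone()[0]
    return {"notes": len(notes), "tasks": tasks}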

What I Learned

SQLite for everything temporal. Querying “entries from last 7 days where speaker=‘simon’” is instant in SQLite and painful with grep over markdown files. The initial ingest takes work, but the query speed pays off immediately.
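
For instance, the query behind that example is a single statement against the entries table (assuming ISO date strings and a content column):

import sqlite3

con = sqlite3.connect("rosebud.db")
recent = con.execute(
    """
    SELECT date, content
    FROM entries
    WHERE speaker = 'simon'
      AND date >= date('now', '-7 days')
    ORDER BY date
    """
).fetchall()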

LLM validation for fuzzy extraction. Regex catches too many false positives when extracting names or keywords. A local LLM doing yes/no validation is cheap and accurate. Cache the results and you only pay once per unique input.

Structured output for synthesis. Free-form LLM output is inconsistent. Pydantic models ensure the reflection always has themes, tensions, and questions — the structure I actually want.

Cross-source correlation is where the value is. Any single data source tells a partial story. The interesting insights come from weaving them together: “You said you wanted X (journal), you have tabs open about X (browser), but no tasks or notes about X (PARA/To-Do) — maybe this is stuck?”

Graceful degradation. Each skill checks what data sources are available and proceeds with whatever exists. If email sync failed, the analysis still runs with Rosebud and PARA data. The output notes what’s missing rather than failing entirely.
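
The check itself is just a handful of path tests before each skill runs. A sketch, using the directory layout shown earlier:

from pathlib import Path

# Data sources the skills know how to use
SOURCES = {
    "email": Path("~/data/email").expanduser(),
    "rosebud": Path("~/data/rosebud/rosebud.db").expanduser(),
    "todo": Path("~/data/todo/todo.db").expanduser(),
}

available = {name for name, path in SOURCES.items() if path.exists()}
missing = set(SOURCES) - available
if missing:
    print(f"Note: analysis running without: {', '.join(sorted(missing))}")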

What Doesn’t Work (Yet)

No mobile access. This is the big one. Everything runs on my Mac. I can’t check my weekly reflection from my phone, can’t quickly look up what I know about someone before a meeting unless I’m at my laptop. That’s a deliberate tradeoff — local data means no cloud sync — but it’s genuinely limiting. I’m considering a read-only web UI that runs locally, but haven’t built it yet.

The data pipeline is fragile. mbsync occasionally fails silently. Rosebud requires manual export (I haven’t automated the browser-based export yet). The Vivaldi session parser breaks when Vivaldi updates their format. I spend maybe 20 minutes a week fixing sync issues.

Person name extraction has edge cases. If you journal about people with names like “Will”, “April”, or “Paris”, the LLM validation helps but isn’t perfect. The cache makes re-validation instant, but new ambiguous names need occasional manual review.


The code lives in a private repo, but I’m happy to share specific patterns if you’re building something similar. The key dependencies are pydantic-ai for structured LLM output, notmuch for email indexing, and Claude Code’s MCP support for live API integrations like Clay.
