We Built a Custom Graph RAG to Let AI Answer "Did That Initiative Actually Work?"
Hi, I'm Ryan, CTO at airCloset. In my previous posts, I introduced an MCP server that lets you search all company databases in natural language, and showed the full picture of our 17 internal MCP servers. This time, I'm diving deep into what I briefly mentioned as "Biz Graph." This is the story of how we represented the relationship between business initiatives and KPIs as a graph structure, enabling AI to answer "Did that initiative actually work?"

To get more value from AI, what matters is not just feeding it data — it's conveying the relationships between data. If your data volume is small enough, tools like NotebookLM can deliver great results. But you can't fit all your business data into a context window. Initiative reports, KPI spreadsheets, marketing weekly reports, logistics daily metrics — you simply cannot dump all of that into a prompt.

That's why I believe the best available option right now is Graph RAG: making the right data searchable at any time, along with its relationships. When AI is asked "What metrics are related to this initiative?", it can traverse the graph and extract only the information it needs — because that structure was built in advance.

But there's a catch. Many of you have heard of "knowledge graphs" and "GraphRAG." But when they actually try to build one, most people hit the same wall: business data doesn't naturally form a graph.

With our DB Graph project, things were different. Tables had foreign keys. ORMs had @JoinColumn and belongsTo. Relationships already existed in the data — we just had to parse and convert them. But the relationship between "initiatives" and "KPIs" has none of that.

- A meeting slide says "SNS ad campaign launched"
- A spreadsheet records "This week's new members: 1,234"

There's no FK between these. No join key. "The SNS campaign affected new member signups" — that relationship exists only in someone's head. It's nowhere in the spreadsheet. This is what "business data doesn't form a graph" means.
The relationships between entities aren't self-evident — you have to design the graph structure itself.

Every week, our company reports initiative progress in all-hands meetings and group-level standups. "We launched the spring SNS ad campaign" — dozens of initiatives reported weekly. Hundreds per year. Over 5,000 total. Meanwhile, a separate spreadsheet tracks 200+ metrics daily and weekly: member count, new signups, retention rate, satisfaction scores, acquisition CPA...

The problem: these two worlds are completely disconnected. "How much did last month's SNS campaign contribute to new member acquisition?" Answering this requires:

1. Confirm the initiative's execution period (which slide was that again?)
2. Find KPI data for that period (which sheet, which tab?)
3. Align timeframes and compare numbers (week-over-week? month-over-month? year-over-year?)
4. Check if other initiatives were running simultaneously (confounding factors?)

This manual analysis takes 30-60 minutes, happening every week for multiple initiatives. Realistically, most initiative effectiveness reviews end with "it probably worked, I think." We built Biz Graph to solve this.

Note: The numbers below differ from actual values but convey the order of magnitude. In any case, this is far too much data to fit in an LLM's context window.

| Resource | Count |
|----------|-------|
| Nodes | ~10,000 (14 types) |
| Edges | ~71,000 (22 types) |
| Initiatives | ~5,000 |
| KPI Metrics | ~4,000 (members/signups/retention/satisfaction/UX/marketing/logistics) |
| Marketing Channels | ~100 (SEM/LINE/email/CRM etc.) |
| Data Sources | 9 tables/spreadsheets |

- Biz Graph Transformer — Weekly graph rebuild from all data sources (Cloud Run Job, every Friday 22:00)
- Biz Graph MCP Server — Graph search + time series analysis accessible from AI (Cloud Run)
- Biz Data Loader — Daily auto-import of marketing/logistics data (Cloud Run Job, every morning 6:00)

Here's the heart of this article. How do you connect "initiatives" and "metrics" in a graph?
The obvious first thought is direct edges:

```
Initiative("SNS campaign") ──AFFECTS──→ Metric("new_members")
```

This design breaks down. Three reasons:

1. Edge explosion: 5,000 initiatives × 4,000 metrics = up to 20 million edges
2. Causal uncertainty: "SNS campaign affected new members" is a hypothesis, not a fact. Direct edges make it look like a confirmed relationship
3. Missing temporal info: There's no way to express when the impact occurred

Instead, we designed Week nodes as shared anchors for indirect connections.

```
Initiative("SNS campaign")    ──ACTIVE_DURING_WEEK────→ Week:2026-03-03
Metric("new_members")         ──HAS_DATA_AT───────────→ Week:2026-03-03
QualityMetric("avg_rating")   ──HAS_QUALITY_DATA_AT───→ Week:2026-03-03
MarketingChannel("SEM brand") ──HAS_MARKETING_DATA_AT─→ Week:2026-03-03
```

Initiatives and metrics aren't directly connected — they're indirectly linked through the same week.

1. Prevents edge explosion: Initiatives only connect to "weeks they were active." Metrics only connect to "weeks that have data." Instead of a cross-product, each connects independently to Week nodes — edge count grows linearly.
2. Expresses co-occurrence, not causation: "Initiatives that were active the same week as metric fluctuations" — this isn't asserting causation, it's a structure for discovering causal candidates. It leaves room for human or AI judgment.
3. Edge types distinguish data sources: Same Week node, but HAS_DATA_AT (business KPIs), HAS_QUALITY_DATA_AT (service quality), HAS_UX_DATA_AT (UX metrics), HAS_MARKETING_DATA_AT (marketing), HAS_LOGI_DATA_AT (logistics) — "what kind of data" is embedded in the edge type itself.
4. Time series traversal is natural: Week nodes are connected by NEXT_WEEK edges. "How did metrics change in the 3 weeks before and after initiative start?" can be expressed as graph traversal.

Week nodes tell us "what happened the same week," but not which metrics are relevant to a given initiative. There's no point looking at logistics data when analyzing an SNS ad campaign.
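To make point 1 above concrete, here's a minimal back-of-the-envelope sketch. The per-node averages are made-up illustration values, not our real figures; the point is only that the week-anchor design grows linearly while the direct-edge design grows as a product:

```typescript
// Direct-edge design: worst case, every initiative links to every metric.
const initiatives = 5_000;
const metrics = 4_000;
const directEdgesWorstCase = initiatives * metrics; // 20,000,000

// Week-anchor design: each node links only to its own weeks.
// Hypothetical averages: an initiative active ~4 weeks, a metric
// with ~2 years (104 weeks) of weekly data points.
const avgActiveWeeksPerInitiative = 4;
const avgDataWeeksPerMetric = 104;
const weekAnchoredEdges =
  initiatives * avgActiveWeeksPerInitiative + // ACTIVE_DURING_WEEK edges
  metrics * avgDataWeeksPerMetric;            // HAS_DATA_AT edges

console.log(directEdgesWorstCase); // 20000000
console.log(weekAnchoredEdges);    // 436000
```

Even with generous assumptions, the anchored design stays orders of magnitude below the cross-product worst case, and adding a node adds edges proportional only to its own weeks.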
However, there's no join key between initiative categories ("Marketing (Advertising)") and metric groups ("New Acquisition"). The knowledge that "ad initiatives relate to new acquisition" is tacit — it exists only in people's heads. MetricDomain (6 domains) structuralizes this tacit knowledge.

| Domain | Meaning | Connected metric types |
|--------|---------|------------------------|
| acquisition | New acquisition | Marketing channels, new member count, registration CV |
| retention | Retention / churn prevention | Member count, churn rate, plan transitions |
| service_quality | Service quality | Satisfaction, ratings |
| operations | Operations | Selection, shipping, returns, logistics KPIs |
| ux | UX experience | Sessions, funnels |
| revenue | Revenue / purchases | Purchase CV, upsell |

By having humans define the mapping between initiative categories and MetricDomains, and between metric groups and MetricDomains, we enable "automatically show acquisition-related metrics when viewing a marketing initiative."

```
Category("Marketing ads") ──CATEGORY_IN_DOMAIN──→ MetricDomain("acquisition")
                                                        ↑ IN_DOMAIN
                                   MetricGroup("New Acquisition")
                                   MarketingChannel("SEM brand")
                                   UxMetric("registration_completed")
```

Result: Pass domain: "acquisition" to compare_metrics, and the initiative overlay automatically filters to acquisition-related initiatives only.

Another unique design element: SIMILAR_TO edges. Initiative text (title + description) is vectorized to 768 dimensions using Vertex AI's gemini-embedding-001, then BigQuery's VECTOR_SEARCH auto-detects similar pairs with cosine similarity >= 0.75.

```sql
SELECT base.id, query.id, distance
FROM VECTOR_SEARCH(
  TABLE cortex.biz_graph_nodes,
  'embedding',
  (SELECT id, embedding FROM cortex.biz_graph_nodes WHERE node_type = 'Initiative'),
  top_k => 6,
  distance_type => 'COSINE'
)
WHERE base.id != query.id
  AND distance <= 0.25  -- cosine distance 0.25 = cosine similarity 0.75
```

Currently ~13,000 SIMILAR_TO edges exist. Up to 5 similar initiatives are pre-computed for each one. "Didn't we run a similar SNS campaign last summer? How did that one perform?"
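At its core, the domain bridge is just two small human-maintained lookup tables joined on a shared domain key. A minimal sketch of the idea (the mapping entries and function name here are illustrative, not the production mapping):

```typescript
// Tacit knowledge made explicit: which initiative categories and which
// metric groups belong to which MetricDomain. (Entries are illustrative.)
const CATEGORY_IN_DOMAIN: Record<string, string[]> = {
  'Marketing (Advertising)': ['acquisition'],
  'CRM / Retention': ['retention'],
};
const GROUP_IN_DOMAIN: Record<string, string[]> = {
  'New Acquisition': ['acquisition'],
  'Membership': ['retention'],
};

// "Show acquisition-related metric groups for a marketing initiative":
// resolve the category to its domains, then keep groups sharing a domain.
function relevantGroups(category: string): string[] {
  const domains = new Set(CATEGORY_IN_DOMAIN[category] ?? []);
  return Object.entries(GROUP_IN_DOMAIN)
    .filter(([, ds]) => ds.some((d) => domains.has(d)))
    .map(([group]) => group);
}

console.log(relevantGroups('Marketing (Advertising)')); // ['New Acquisition']
```

There is no FK anywhere in the source data that makes this join possible; the two lookup tables *are* the structuralized tacit knowledge.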
— traverse similar initiatives on the graph instantly, then compare KPI changes during the weeks those initiatives were active.

Here's how exploration works via MCP tools. All tool execution examples below run through MCP from an AI coding agent. The response format matches the real system, but numbers are dummy values and content is simplified.

```
search_initiatives({
  "query": "SNS advertising for new acquisition",
  "domain": "acquisition",
  "dateFrom": "2025-10-01",
  "dateTo": "2026-03-31",
  "limit": 5
})
```

Response (excerpt):

```
5 initiatives found (by vector similarity):
1. SNS Ad Spring Collection Campaign (2026-03-09)
   Category: Marketing (Advertising)  Similarity: 892/1000
2. Instagram Reels Ad Test (2026-02-23)
   Category: Marketing (Advertising)  Similarity: 845/1000
...
```

```
get_initiative_context({
  "initiative_id": "Initiative:2026-03-09:SNS Ad Spring Collection Campaign",
  "metric_window_days": 30
})
```

Response (excerpt):

```
## Initiative Context
Title: SNS Ad Spring Collection Campaign
Execution Period: 2026-03-01 to 2026-03-31
Category: Marketing (Advertising)
Target Domain: acquisition

## Similar Initiatives (SIMILAR_TO)
- Instagram Reels Ad Test (similarity: 0.82)
- 1-Month Free Trial Campaign (similarity: 0.78)

## KPI Changes During Initiative (30-day window)
| Metric | Pre-avg | Post-avg | Change |
|--------|---------|----------|--------|
| new_regular | 50 | 60 | +20.0% |
| new_lite | 30 | 35 | +16.7% |
| monthly | 1,000 | 1,050 | +5.0% |

## Service Quality Metrics
| Metric | Before | After | Change |
|--------|--------|-------|--------|
| avg_rating | 3.50 | 3.60 | +2.9% |

## UX Metrics
| Metric | Before | After | Change |
|--------|--------|-------|--------|
| total_sessions | 10,000 | 12,000 | +20.0% |
| registration_completed | 100 | 130 | +30.0% |
```

This is the power of the Week node design. Identify the weeks an initiative was active, then automatically pull all metrics (KPIs, quality, UX, marketing, logistics) from those same weeks.
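The Change column in those response tables is a plain pre/post comparison over the metric window. A minimal sketch of how such a delta could be computed (the helper names are mine, not the real implementation):

```typescript
// Average a metric's values over a window.
const avg = (xs: number[]): number =>
  xs.reduce((a, b) => a + b, 0) / xs.length;

// Percent change of the post-window average vs the pre-window average,
// formatted like the MCP response tables ("+20.0%", "-5.0%").
function changePct(pre: number[], post: number[]): string {
  const delta = ((avg(post) - avg(pre)) / avg(pre)) * 100;
  return `${delta >= 0 ? '+' : ''}${delta.toFixed(1)}%`;
}

// new_regular: pre-avg 50, post-avg 60
console.log(changePct([48, 50, 52], [58, 60, 62])); // "+20.0%"
```

Note this is descriptive, not inferential: it reports what moved, and deliberately says nothing about why — consistent with the "co-occurrence, not causation" design.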
```
compare_metrics({
  "metrics": ["new_regular", "new_lite", "new_monthly"],
  "dateFrom": "2025-10-01",
  "dateTo": "2026-03-31",
  "granularity": "weekly",
  "overlay_initiatives": true,
  "domain": "acquisition"
})
```

Time series data with acquisition-domain initiatives overlaid on the same timeframe. KPI spikes become instantly attributable to "that initiative's timing."

The graph is constructed in 9 phases:

| Phase | Content | Output |
|-------|---------|--------|
| 1 | Initiative nodes + Category/Business/Team | Initiative, Category, Business, Team |
| 2 | Daily KPIs (50 metrics) | Metric → MetricGroup (10 groups) |
| 3 | Business KPIs + Departments | Department → Metric (DEPT_TRACKS) |
| 4 | Week nodes (shared anchors) | HAS_DATA_AT + ACTIVE_DURING_WEEK + NEXT_WEEK |
| 5 | Service quality metrics (~50) | QualityMetric → Week |
| 6 | UX metrics (~40) | UxMetric → Week |
| 7 | Marketing channels (~100) | MarketingChannel → Week |
| 8 | MetricDomain (semantic bridge) | 6 domains + IN_DOMAIN + TARGETS_DOMAIN |
| 9 | Logistics KPIs (~10 categories) | LogiMetric → Week |

Phases 4 and 8 are the key design points. The other phases simply "turn data into nodes" — these two "structuralize relationships that don't exist."
```typescript
// Convert initiative execution period to ISO weeks, generate ACTIVE_DURING_WEEK edges
for (const initiative of initiatives) {
  const weeks = getISOWeeksBetween(
    initiative.executionStartDate,
    initiative.executionEndDate
  );
  // Cap at 52 weeks (guard against long-running initiatives)
  for (const week of weeks.slice(0, 52)) {
    edges.push({
      edge_type: 'ACTIVE_DURING_WEEK',
      source_id: initiative.id,
      target_id: `Week:${week}`,
    });
  }
}

// Generate HAS_DATA_AT edges for weeks that have metric data
for (const metricWeek of metricWeeks) {
  edges.push({
    edge_type: 'HAS_DATA_AT',
    source_id: `Metric:${metricWeek.metric}`,
    target_id: `Week:${metricWeek.week}`,
  });
}

// NEXT_WEEK edges for time series traversal
const sortedWeeks = [...allWeeks].sort();
for (let i = 0; i < sortedWeeks.length - 1; i++) {
  edges.push({
    edge_type: 'NEXT_WEEK',
    source_id: `Week:${sortedWeeks[i]}`,
    target_id: `Week:${sortedWeeks[i + 1]}`,
  });
}
```

```typescript
// Human-defined mapping: initiative category → MetricDomain(s)
const CATEGORY_TO_DOMAINS: Record<string, string[]> = {
  'Marketing (Advertising)': ['acquisition'],
  'CRM / Retention': ['retention'],
  'Quality / Service Improvement': ['service_quality'],
  'Operations Improvement': ['operations'],
  'New Feature': ['ux', 'revenue'],
  // ...
};

// Initiative → TARGETS_DOMAIN (main business only — limited to where KPI data exists)
for (const initiative of initiatives) {
  if (initiative.business !== MAIN_BUSINESS) continue;
  const domains = CATEGORY_TO_DOMAINS[initiative.category] ?? [];
  for (const domain of domains) {
    edges.push({
      edge_type: 'TARGETS_DOMAIN',
      source_id: initiative.id,
      target_id: `MetricDomain:${domain}`,
    });
  }
}
```

We implemented the graph using BigQuery alone, without Neo4j, Amazon Neptune, or OSS like Microsoft's GraphRAG.

| Aspect | Dedicated Graph DB | BigQuery |
|--------|--------------------|----------|
| Graph traversal | Fast (native) | Fast enough (~10,000 node scale) |
| Vector search | Requires separate service | VECTOR_SEARCH built-in |
| Time series analysis | Weak | Native (window functions) |
| Operating cost | Always-on instances | Serverless (pay per query) |
| Joining other data | ETL required | Same project, instant JOIN |

For Biz Graph, "graph structure + time series analysis + vector search combined" matters more than "deep graph traversal." BigQuery handles all three in one engine.
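The `getISOWeeksBetween` helper used in the transformer snippet above isn't shown. Here's one way it could look, assuming Week node keys are ISO year-week strings like `2026-W10` (the real key format may differ — the examples earlier use Monday-date keys):

```typescript
// ISO week key ("2026-W01") for a date. ISO weeks start on Monday,
// and a week belongs to the year that contains its Thursday.
function isoWeekKey(date: Date): string {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const dayIdx = (d.getUTCDay() + 6) % 7;    // Mon=0 … Sun=6
  d.setUTCDate(d.getUTCDate() - dayIdx + 3); // shift to this week's Thursday
  const isoYear = d.getUTCFullYear();
  const jan4 = new Date(Date.UTC(isoYear, 0, 4)); // Jan 4 is always in ISO week 1
  const week1Monday = new Date(jan4);
  week1Monday.setUTCDate(jan4.getUTCDate() - ((jan4.getUTCDay() + 6) % 7));
  const week = Math.floor((d.getTime() - week1Monday.getTime()) / (7 * 86_400_000)) + 1;
  return `${isoYear}-W${String(week).padStart(2, '0')}`;
}

// All ISO weeks touched by [startDate, endDate], inclusive.
function getISOWeeksBetween(startDate: string, endDate: string): string[] {
  const weeks: string[] = [];
  const cur = new Date(startDate);
  const end = new Date(endDate);
  for (; cur <= end; cur.setUTCDate(cur.getUTCDate() + 7)) {
    weeks.push(isoWeekKey(cur));
  }
  // Stepping 7 days at a time can skip the final partial week; include it.
  const last = isoWeekKey(end);
  if (weeks[weeks.length - 1] !== last) weeks.push(last);
  return weeks;
}

console.log(getISOWeeksBetween('2026-03-01', '2026-03-31'));
// → ['2026-W09', '2026-W10', '2026-W11', '2026-W12', '2026-W13', '2026-W14']
```

The year-rollover rule matters here: late-December dates can belong to the next year's week 1, so deriving the key from the week's Thursday keeps period boundaries consistent.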
Additionally, BigQuery has announced Graph capabilities — once GA, native graph queries on node/edge tables will be available. Currently we traverse with SQL JOINs, but we expect to migrate to faster, more intuitive queries in the future.

OSS like Microsoft GraphRAG and various Graph RAG SaaS products focus on automatically extracting entities and relationships from text documents. Great for research papers or news articles, but not for our use case. The reason is simple: we need to design the graph structure itself.

- The concept of Week nodes as "temporal anchors" doesn't exist in generic tools
- MetricDomain "semantic bridging" reflects our specific business structure
- The Initiative → Week → Metric indirect connection pattern won't emerge from LLM entity extraction

Generic tools "auto-generate graphs from text." What we needed was "design the graph schema ourselves and integrate heterogeneous data sources." Fundamentally different problems.

Internal query example (get_initiative_context):

```sql
-- Get weeks the initiative was active
WITH active_weeks AS (
  SELECT target_id AS week_id
  FROM cortex.biz_graph_edges
  WHERE source_id = @initiative_id
    AND edge_type = 'ACTIVE_DURING_WEEK'
),
-- Get metrics that have data in those same weeks
co_occurring_metrics AS (
  SELECT e.source_id AS metric_id, e.edge_type, w.week_id
  FROM cortex.biz_graph_edges e
  JOIN active_weeks w ON e.target_id = w.week_id
  WHERE e.edge_type IN (
    'HAS_DATA_AT',
    'HAS_QUALITY_DATA_AT',
    'HAS_UX_DATA_AT',
    'HAS_MARKETING_DATA_AT'
  )
)
SELECT * FROM co_occurring_metrics
```

Graph traversal and time series data retrieval complete in a single SQL query. With a dedicated graph DB, you'd need to pass traversal results to another service for time series queries — an extra hop.

Graph quality depends on source data quality. Initiative data comes from all-hands and group meeting slides.
| Source | Format | Frequency |
|--------|--------|-----------|
| All-hands | pptx in Drive → Slides conversion → text extraction | Weekly |
| Group standups | Google Slides (cumulative, latest week appended) | Weekly |

Text is extracted from meeting slides and structured by AI into the initiative table.

```typescript
interface InitiativeRow {
  meetingDate: string;        // Meeting date
  source: string;             // Source (all-hands / group standup etc.)
  business: string;           // Business unit
  category: string;           // Marketing (Ads), New Feature, ...
  title: string;              // Initiative title
  description: string;        // Detailed description
  team: string;               // Executing team
  executionStartDate: string; // Execution start date
  executionEndDate: string;   // Execution end date
  metrics: string;            // JSON format numeric metrics
  status: string;             // planned / in_progress / retrospective
}
```

Critical: executionStartDate / executionEndDate. The meeting date (meetingDate) differs from when the initiative actually runs. "We started the SNS campaign last week," reported on 3/9, means executionStartDate is 3/1. This distinction is essential for accurate Week node connections.

| Resource | Cost |
|----------|------|
| Vertex AI Embedding (weekly) | ~$0.05/run |
| Claude Code (initiative extraction) | Within monthly plan |
| BQ storage | A few GB (negligible) |
| Cloud Run Jobs | Nearly free (1x weekly + 1x daily) |
| MCP Server | Nearly free (Cloud Run min-instances=0) |

A few dollars per month to maintain a 10,000-node, 71,000-edge graph.

Let's revisit the "why Graph RAG?" question from the introduction. For initiative effectiveness analysis, consider what happens with standard vector search (flat RAG). Ask "What was the SNS campaign's impact?" — flat RAG returns text chunks similar to the initiative description. You get info about the initiative itself. But it won't return concurrent KPI changes. It won't return results from past similar initiatives. It won't return related domain metrics. These pieces of information are connected "through the graph," not by "text similarity." You can only reach them by traversing Week nodes.
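Because the `metrics` field of `InitiativeRow` arrives as a JSON string produced by AI extraction, one malformed row shouldn't break a weekly rebuild. A defensive parse along these lines is a sketch, not the actual loader code, and the field shape (flat name→number object) is my assumption:

```typescript
// Parse the JSON-encoded metrics field, tolerating malformed rows
// from the AI extraction step.
function parseMetrics(raw: string): Record<string, number> {
  try {
    const parsed = JSON.parse(raw);
    if (parsed === null || typeof parsed !== 'object') return {};
    // Keep only finite numeric values; drop strings, nulls, nested objects.
    return Object.fromEntries(
      Object.entries(parsed).filter(
        ([, v]) => typeof v === 'number' && Number.isFinite(v),
      ),
    ) as Record<string, number>;
  } catch {
    return {}; // malformed JSON → treat as "no metrics" rather than failing
  }
}

console.log(parseMetrics('{"impressions": 120000, "note": "est."}')); // { impressions: 120000 }
console.log(parseMetrics('n/a')); // {}
```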
This "need to follow relationships" use case is exactly where Graph RAG has a clear advantage over flat RAG.

One thing I was conscious of in this design: not asserting causation. Many BI tools and AI analyses want to declare "this initiative impacted this KPI." But in reality, there's no such certainty. Multiple initiatives may have been running simultaneously, it could be seasonal, it could be external market changes. Week node indirect connections simply "lay out what happened in the same period." Causal judgment is left to human or AI reasoning. I believe this is a statistically honest approach. "A structure for discovering causal candidates" — not "a structure for asserting causation." This distinction matters.

Let me be honest about the weaknesses of this approach. MetricDomain mappings ("Marketing (Advertising) → acquisition domain") are hardcoded by humans. If this design is wrong, the entire graph's exploration results are skewed. At the same time, this is the answer to "why build it yourself": off-the-shelf graph tools can't reflect your business structure — which initiative categories relate to which metric groups. Structuralizing this tacit knowledge requires someone who knows the business. Going forward, we're considering having AI propose these mappings with humans reviewing them. Full automation is hard, but an "AI suggests, humans approve" workflow could reduce the maintenance cost of domain knowledge.

Turning business data into a graph is more of a design challenge than a technical one. There's no FK between "initiatives" and "KPIs." No join key. But by designing two artificial structures — a temporal axis (Week nodes) and semantic domains (MetricDomain) — it becomes an explorable graph.

- Week nodes: Indirect connections via "same week" instead of direct initiative-metric edges. A structure for discovering causal candidates
- MetricDomain: Semantic bridge between initiative categories and metric groups. Structuralized tacit knowledge
- SIMILAR_TO: Pre-computed similar initiatives via AI Embedding. Instant answers to "have we done this before?"

As a result, AI can now autonomously explore the graph to answer questions like "Did that initiative work?", "Find initiatives that drove acquisition", and "Show metrics YoY with initiative overlay."

Graphs aren't something you "find" — they're something you design. Especially for business data.

At airCloset, we're looking for people who want to redefine how we work alongside AI. If interested, check out airCloset Quest.
