DEV Community

Linghua Jin

We Built an Open-Source Pipeline That Turned Meeting Notes Into a Live Knowledge Graph — And It Went Viral (200K Impressions)

🚀 The Result: 200K Social Impressions and Viral Engagement

Our latest project just exploded on LinkedIn with 200K+ impressions, and for good reason. We built something that solves a real problem most companies face: their meeting notes are a goldmine of untapped knowledge, but nobody has time to manually organize them.

💡 The Problem

Most companies sit on an ocean of meeting notes scattered across Google Drive. Inside those documents are:

  • Critical decisions that shape product direction
  • Action items and task assignments
  • Key relationships between people, projects, and initiatives
  • Institutional knowledge that disappears when people leave

But here's the catch: these documents are constantly changing. Traditional data pipelines would reprocess everything from scratch every time, wasting compute and money on unchanged files.

⚡ The Solution: Incremental Processing with CocoIndex

We built an open-source pipeline that:

  1. Connects directly to Google Drive - no manual exports needed
  2. Only processes what changed - incremental LLM extraction means zero reprocessing of unchanged docs
  3. Builds a live knowledge graph - automatically extracts entities and relationships, then updates Neo4j in real time
  4. Production-ready - fully open-sourced under the Apache 2.0 license
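
To make "only processes what changed" concrete, here is a minimal sketch of the general idea in plain Python. It is not CocoIndex's API or how CocoIndex works internally. It assumes a simple content-fingerprint check, and the names `IncrementalProcessor`, `sync`, and `extract` are hypothetical, chosen only for illustration.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(text: str) -> str:
    """Return a stable hash of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class IncrementalProcessor:
    """Hypothetical helper: re-runs extraction only for new or changed docs."""

    def __init__(self, extract: Callable[[str], dict]):
        self._extract = extract            # stand-in for the expensive LLM call
        self._seen: Dict[str, str] = {}    # doc_id -> last processed fingerprint
        self.results: Dict[str, dict] = {}  # doc_id -> last extraction result

    def sync(self, docs: Dict[str, str]) -> List[str]:
        """Process the current snapshot of docs; return the ids that were (re)processed."""
        processed = []
        for doc_id, text in docs.items():
            fp = fingerprint(text)
            if self._seen.get(doc_id) == fp:
                continue  # unchanged: skip the expensive extraction
            self.results[doc_id] = self._extract(text)
            self._seen[doc_id] = fp
            processed.append(doc_id)
        # Forget documents that no longer exist in the source.
        for doc_id in list(self._seen):
            if doc_id not in docs:
                del self._seen[doc_id]
                self.results.pop(doc_id, None)
        return processed
```

Calling `sync` twice with the same snapshot triggers extraction only on the first call. Editing one document afterwards triggers exactly one more extraction, for that document.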

🔧 How It Works

The pipeline continuously:

  • Monitors your Google Drive for changes
  • Uses an LLM to extract structured data (people, decisions, tasks, relationships)
  • Incrementally updates the knowledge graph - only changed documents get reprocessed
  • Serves fresh insights through Neo4j queries
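
As a rough illustration of the extract-then-update steps above, the sketch below models what an LLM might extract from one meeting note and turns it into parameterized Cypher `MERGE` statements. The schema is an assumption made for this example, not the project's actual graph model: the `Meeting`, `Person`, and `Task` labels, the `ATTENDED`, `DECIDED`, and `ASSIGNED_TO` relationship types, and the `to_cypher` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Task:
    description: str
    assignee: str


@dataclass
class MeetingNote:
    """Hypothetical shape of the structured data extracted from one note."""
    note_id: str
    title: str
    attendees: List[str] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)


def to_cypher(note: MeetingNote) -> List[Tuple[str, Dict[str, str]]]:
    """Build (query, parameters) pairs that upsert the note into the graph."""
    statements = [(
        "MERGE (m:Meeting {id: $id}) SET m.title = $title",
        {"id": note.note_id, "title": note.title},
    )]
    for name in note.attendees:
        statements.append((
            "MATCH (m:Meeting {id: $id}) "
            "MERGE (p:Person {name: $name}) "
            "MERGE (p)-[:ATTENDED]->(m)",
            {"id": note.note_id, "name": name},
        ))
    for task in note.tasks:
        statements.append((
            "MATCH (m:Meeting {id: $id}) "
            "MERGE (t:Task {description: $description}) "
            "MERGE (p:Person {name: $assignee}) "
            "MERGE (m)-[:DECIDED]->(t) "
            "MERGE (t)-[:ASSIGNED_TO]->(p)",
            {"id": note.note_id, "description": task.description,
             "assignee": task.assignee},
        ))
    return statements
```

`MERGE` is used instead of `CREATE` so that re-running the statements for an edited note does not duplicate nodes or relationships. Each pair could then be passed to a Neo4j driver session, for example `session.run(query, params)`. Removing relationships that no longer appear in a note is left out of this sketch.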

Tech Stack:

  • CocoIndex - for incremental data processing
  • Neo4j - for the knowledge graph
  • LLM - for entity and relationship extraction
  • Google Drive API - for document access

📚 Full Tutorial Available

We've published a complete step-by-step tutorial with:

  • Full source code (Apache 2.0)
  • Architecture explanations
  • Setup instructions
  • Real examples

🔗 Links:

🎯 Why This Matters

Incremental processing is the key differentiator here. Most pipelines are "dumb" - they reprocess everything even if only one document changed. That's:

  • ❌ Expensive (LLM costs add up fast)
  • ❌ Slow (unnecessary compute time)
  • ❌ Wasteful (environmental impact)

With CocoIndex's incremental approach:

  • ✅ Only pay for what changed
  • ✅ Real-time updates
  • ✅ Scales with your document library
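
A quick back-of-the-envelope comparison shows why this adds up. The numbers below are made up for illustration, not measurements from the project, and the model is deliberately simple: one LLM call per processed document, with the first run processing everything.

```python
def llm_calls_full(num_docs: int, num_runs: int) -> int:
    """Every run reprocesses every document."""
    return num_docs * num_runs


def llm_calls_incremental(num_docs: int, num_runs: int, changed_per_run: int) -> int:
    """The first run processes everything; later runs process only changed documents."""
    if num_runs == 0:
        return 0
    return num_docs + changed_per_run * (num_runs - 1)


# Hypothetical workload: 1,000 notes, one sync per day for 30 days,
# and 10 notes edited between consecutive syncs.
full = llm_calls_full(1000, 30)
incremental = llm_calls_incremental(1000, 30, 10)
print(full, incremental)  # 30000 1290
```

Under these assumptions the incremental pipeline makes about 4% of the LLM calls of the full-reprocessing one, and the gap widens the longer the pipeline runs.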

🌟 Built With Open Source

This entire project is open source and production-ready. Whether you're a startup drowning in meeting notes or an enterprise looking to unlock institutional knowledge, you can deploy this today.

What could you build with a live knowledge graph of your company's meeting notes?

Drop your thoughts in the comments! And if you're working on similar problems, I'd love to hear about your approach. 👇
