Linghua Jin

Posted on Jan 5

We Built an Open-Source Pipeline That Turned Meeting Notes Into a Live Knowledge Graph — And It Went Viral (200K Impressions)

#opensource #knowledgegraph #llm #python

🚀 The Result: 200K Social Impressions and Viral Engagement

Our latest project just exploded on LinkedIn with 200K+ impressions, and for good reason. We built something that solves a real problem most companies face: their meeting notes are a goldmine of untapped knowledge, but nobody has time to manually organize them.

💡 The Problem

Most companies sit on an ocean of meeting notes scattered across Google Drive. Inside those documents are:

Critical decisions that shape product direction
Action items and task assignments
Key relationships between people, projects, and initiatives
Institutional knowledge that disappears when people leave

But here's the catch: these documents are constantly changing. A traditional batch pipeline reprocesses everything from scratch on every run, wasting compute and money on files that have not changed.

⚡ The Solution: Incremental Processing with CocoIndex

We built an open-source pipeline that:

Connects directly to Google Drive - no manual exports needed
Only processes what changed - incremental LLM extraction means zero reprocessing of unchanged docs
Builds a live knowledge graph - automatically extracts entities and relationships, then updates Neo4j in real time
Production-ready - fully open-sourced under the Apache 2.0 license

🔧 How It Works

The pipeline continuously:

Monitors your Google Drive for changes
Uses an LLM to extract structured data (people, decisions, tasks, relationships)
Incrementally updates the knowledge graph - only changed documents get reprocessed
Serves fresh insights through Neo4j queries
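
To make the incremental step concrete, here is a minimal sketch of the idea in plain Python. It is not CocoIndex's API or internals: the `fingerprint` and `incremental_run` helpers, the in-memory `seen` dict, and the stubbed `fake_extract` function are all assumptions made for illustration. The sketch fingerprints each document's content and only calls the expensive extractor when that fingerprint differs from the previous run.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(text: str) -> str:
    """Stable content hash used to detect whether a document changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_run(
    docs: Dict[str, str],
    seen: Dict[str, str],
    extract: Callable[[str], dict],
) -> List[str]:
    """Run `extract` only on new or changed docs; return the ids processed."""
    processed = []
    for doc_id, text in docs.items():
        fp = fingerprint(text)
        if seen.get(doc_id) == fp:
            continue  # unchanged since the last run: skip the LLM call
        extract(text)
        seen[doc_id] = fp
        processed.append(doc_id)
    # Forget documents that no longer exist at the source.
    for doc_id in list(seen):
        if doc_id not in docs:
            del seen[doc_id]
    return processed


# Hypothetical stand-in for the LLM extraction step.
calls = []

def fake_extract(text: str) -> dict:
    calls.append(text)
    return {}


seen: Dict[str, str] = {}
docs = {"a": "Kickoff notes", "b": "Sprint review"}

first = incremental_run(docs, seen, fake_extract)   # ['a', 'b']
docs["b"] = "Sprint review (edited)"
second = incremental_run(docs, seen, fake_extract)  # ['b']
```

Across both runs the extractor is called three times rather than four, because document `a` never changed. A real deployment would persist `seen` between runs and would also need to remove graph data for deleted documents, which this sketch leaves out.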

Tech Stack:

CocoIndex - for incremental data processing
Neo4j - for the knowledge graph
LLM - for entity and relationship extraction
Google Drive API - for document access
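
One way to keep graph writes safe to repeat is Cypher's `MERGE`, which matches an existing node or relationship and creates it only when it is missing. The sketch below builds parameterized statements from extracted data. The node labels (`Meeting`, `Person`, `Task`), the relationship types, the `build_upserts` helper, and the shape of the `extracted` dict are all hypothetical, not the schema used in the tutorial.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, Dict[str, str]]


def build_upserts(note_id: str, extracted: Dict) -> List[Statement]:
    """Turn one note's extracted data into idempotent Cypher statements."""
    statements: List[Statement] = [
        ("MERGE (m:Meeting {id: $id})", {"id": note_id}),
    ]
    for person in extracted.get("people", []):
        statements.append((
            "MERGE (m:Meeting {id: $id}) "
            "MERGE (p:Person {name: $name}) "
            "MERGE (p)-[:ATTENDED]->(m)",
            {"id": note_id, "name": person},
        ))
    for task in extracted.get("tasks", []):
        statements.append((
            "MERGE (m:Meeting {id: $id}) "
            "MERGE (t:Task {title: $title}) "
            "MERGE (m)-[:DECIDED]->(t) "
            "MERGE (p:Person {name: $owner}) "
            "MERGE (p)-[:ASSIGNED_TO]->(t)",
            {"id": note_id, "title": task["title"], "owner": task["owner"]},
        ))
    return statements


# Made-up extraction result for a single meeting note.
example = {
    "people": ["Ana", "Ben"],
    "tasks": [{"title": "Ship v2", "owner": "Ana"}],
}
stmts = build_upserts("note-1", example)
```

Each `(query, params)` pair can be passed to the Neo4j Python driver's `session.run(query, params)`. Because every clause is a `MERGE`, re-running the statements for a re-extracted note does not create duplicates. Removing facts that disappeared from an edited note is a separate concern that this sketch does not cover.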

📚 Full Tutorial Available

We've published a complete step-by-step tutorial with:

Full source code (Apache 2.0)
Architecture explanations
Setup instructions
Real examples

🔗 Links:

GitHub Repo: https://github.com/cocoindex-io/cocoindex (⭐ star it if you find it useful!)
Tutorial: https://cocoindex.io/blogs/meeting-notes-graph
Live Example: https://cocoindex.io/examples/meeting_notes_graph

🎯 Why This Matters

Incremental processing is the key differentiator here. Most pipelines are "dumb" - they reprocess everything even if only one document changed. That's:

❌ Expensive (LLM costs add up fast)
❌ Slow (unnecessary compute time)
❌ Wasteful (environmental impact)

With CocoIndex's incremental approach:

✅ Only pay for what changed
✅ Real-time updates
✅ Cost scales with what changed, not with the size of your document library
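
A quick back-of-the-envelope comparison shows why this matters. The numbers below are made up for illustration (1,000 notes, 10 edits per run, 30 runs) and the `llm_calls` helper is hypothetical. It assumes one LLM call per processed document.

```python
def llm_calls(total_docs: int, changed_per_run: int, runs: int, incremental: bool) -> int:
    """Count LLM calls over `runs` pipeline runs, one call per processed doc."""
    if incremental:
        # The first run processes everything; later runs only touch changed docs.
        return total_docs + changed_per_run * (runs - 1)
    # A full-reprocess pipeline touches every doc on every run.
    return total_docs * runs


full = llm_calls(total_docs=1000, changed_per_run=10, runs=30, incremental=False)
incr = llm_calls(total_docs=1000, changed_per_run=10, runs=30, incremental=True)
print(full, incr)  # 30000 1290
```

Under these assumed numbers the incremental pipeline makes roughly 4% of the calls that a full-reprocess pipeline would make. Your ratio will depend on how often your notes actually change.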

🌟 Built With Open Source

This entire project is open source and production-ready. Whether you're a startup drowning in meeting notes or an enterprise looking to unlock institutional knowledge, you can deploy this today.

What could you build with a live knowledge graph of your company's meeting notes?

Drop your thoughts in the comments! And if you're working on similar problems, I'd love to hear about your approach. 👇
