Understanding Context Aggregation
Context aggregation is SignalPilot’s ability to gather and synthesize information from across your entire data stack (databases, notebooks, collaboration tools, and documentation) into a unified context for AI-powered investigations.
Why Context Matters
Traditional AI tools require you to copy and paste context manually.
The MCP Architecture
SignalPilot uses the Model Context Protocol (MCP) to connect to your data stack. MCP provides a standardized way for AI systems to access external context sources.
Internal Context Sources
The internal MCP sidecar provides access to local context that doesn’t require external connections:
Kernel Context
Access to the current Jupyter kernel state:

| Context Type | What It Provides | Example Use |
|---|---|---|
| Variables | Active dataframes, values | “What columns are in df?” |
| Execution History | Recently run code | “What did I just calculate?” |
| Outputs | Cell outputs, plots | “Explain this visualization” |
| Errors | Stack traces, exceptions | “Why did this fail?” |
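As an illustration of the "Variables" row, here is a minimal sketch of how kernel state could be summarized without copying full values. The function name `summarize_namespace` and the truncation length are assumptions made for this example, not SignalPilot's actual implementation.

```python
from typing import Any


def summarize_namespace(namespace: dict[str, Any], max_repr: int = 60) -> list[dict[str, str]]:
    """Describe user-defined variables by name, type, and a short preview."""
    summary = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_"):
            continue  # skip private names and IPython bookkeeping such as _ih and _oh
        text = repr(value)
        if len(text) > max_repr:
            text = text[: max_repr - 3] + "..."  # keep previews short
        summary.append({"name": name, "type": type(value).__name__, "preview": text})
    return summary


# Hypothetical kernel state; in a live notebook this could be globals().
kernel_vars = {"threshold": 0.8, "rows": list(range(1000)), "_private": "hidden"}
for entry in summarize_namespace(kernel_vars):
    print(entry)
```

Truncating the preview keeps large objects from dominating the context while still telling the assistant what exists in the session.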
Schema Explorer
Direct database introspection:

| Context Type | What It Provides | Example Use |
|---|---|---|
| Tables | List of available tables | “What tables contain user data?” |
| Columns | Column names, types, nullability | “What’s the schema of orders?” |
| Relationships | Foreign keys, joins | “How are users and orders related?” |
| Indexes | Performance hints | “Is this query optimized?” |
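To make the four rows above concrete, the following sketch introspects an in-memory SQLite database using the standard `sqlite3` module. SQLite is chosen only because it ships with Python; the `introspect` helper and the sample tables are hypothetical, and other databases expose the same information through their own catalogs.

```python
import sqlite3


def introspect(conn: sqlite3.Connection) -> dict[str, dict]:
    """Collect columns, foreign keys, and indexes for every user table."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {"name": row[1], "type": row[2], "nullable": not row[3]}
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        foreign_keys = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        # PRAGMA index_list rows: (seq, name, unique, origin, partial)
        indexes = [row[1] for row in conn.execute(f'PRAGMA index_list("{table}")')]
        schema[table] = {"columns": columns, "foreign_keys": foreign_keys, "indexes": indexes}
    return schema


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total REAL
    );
    CREATE INDEX idx_orders_user ON orders(user_id);
    """
)
schema = introspect(conn)
print(schema["orders"]["foreign_keys"])
print(schema["orders"]["indexes"])
```

Note that only structure is read here; no row values leave the database.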
Query History
Recent database activity:

| Context Type | What It Provides | Example Use |
|---|---|---|
| Past Queries | SQL that was previously run | “Run that revenue query again” |
| Query Results | Cached result summaries | “What did we find yesterday?” |
| Error History | Failed queries and reasons | “Why did this query fail before?” |
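One way to picture this kind of history is a bounded log that stores the SQL text, a result summary, and any error. The sketch below is an assumption about how such a store could look; `QueryHistory`, `QueryRecord`, and the retention limit are invented names, not part of SignalPilot.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryRecord:
    sql: str
    row_count: Optional[int] = None  # summary of the result, not the rows themselves
    error: Optional[str] = None      # set only when the query failed


class QueryHistory:
    def __init__(self, max_entries: int = 100) -> None:
        # A bounded deque drops the oldest record once the limit is reached.
        self._records = deque(maxlen=max_entries)

    def record(self, sql: str, row_count: Optional[int] = None, error: Optional[str] = None) -> None:
        self._records.append(QueryRecord(sql, row_count, error))

    def latest_matching(self, keyword: str) -> Optional[QueryRecord]:
        """Return the most recent query whose SQL mentions the keyword."""
        for rec in reversed(self._records):
            if keyword.lower() in rec.sql.lower():
                return rec
        return None

    def failures(self) -> list[QueryRecord]:
        return [rec for rec in self._records if rec.error is not None]


history = QueryHistory()
history.record("SELECT sum(amount) AS revenue FROM orders", row_count=1)
history.record("SELECT * FROM ordrs", error="no such table: ordrs")
print(history.latest_matching("revenue"))
print(history.failures())
```

A request such as “Run that revenue query again” then reduces to a keyword lookup over recent records.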
File System
Local file access:

| Context Type | What It Provides | Example Use |
|---|---|---|
| Notebooks | .ipynb files in workspace | “What analysis is in notebook X?” |
| Data Files | CSVs, Parquet, JSON | “Load the customer_data.csv” |
| Configs | Connection strings, settings | “What database am I connected to?” |
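The categories in this table can be sketched as a simple workspace scan. The extension mapping below, in particular which extensions count as configs, is an assumption made for illustration; the source does not say how SignalPilot classifies files.

```python
import tempfile
from pathlib import Path

# Assumed mapping from file extension to the categories in the table above.
CATEGORIES = {
    ".ipynb": "notebooks",
    ".csv": "data_files",
    ".parquet": "data_files",
    ".json": "data_files",
    ".toml": "configs",
    ".yaml": "configs",
    ".yml": "configs",
}


def index_workspace(root: Path) -> dict[str, list[str]]:
    """Group workspace files by category, using paths relative to the root."""
    index = {"notebooks": [], "data_files": [], "configs": []}
    for path in sorted(root.rglob("*")):
        if ".ipynb_checkpoints" in path.parts:
            continue  # Jupyter autosave copies are not useful context
        category = CATEGORIES.get(path.suffix.lower())
        if path.is_file() and category:
            index[category].append(path.relative_to(root).as_posix())
    return index


# Build a throwaway workspace so the example runs anywhere.
workspace = Path(tempfile.mkdtemp())
(workspace / "analysis.ipynb").write_text("{}")
(workspace / "data").mkdir()
(workspace / "data" / "customer_data.csv").write_text("id,name\n1,Ada\n")
(workspace / "settings.toml").write_text("[database]\n")
print(index_workspace(workspace))
```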
External MCP Servers
External MCP servers connect SignalPilot to your broader data ecosystem:
dbt Integration
What it provides:
- Model lineage (upstream/downstream dependencies)
- Model documentation and descriptions
- Test results and data quality status
- Column-level lineage
Example questions:
- “What models feed into the revenue dashboard?”
- “Show me the lineage for customer_lifetime_value”
- “Are there any failing dbt tests?”
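For a sense of how a lineage question could be answered, dbt writes a `manifest.json` artifact whose `parent_map` lists each node's direct parents. The sketch below walks that map to collect every upstream node. The project name and node IDs are hypothetical, and this is not a description of how SignalPilot's dbt integration works internally.

```python
def upstream(parent_map: dict[str, list[str]], node_id: str) -> set[str]:
    """Return every direct and transitive parent of node_id."""
    seen: set[str] = set()
    stack = [node_id]
    while stack:
        current = stack.pop()
        for parent in parent_map.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical excerpt shaped like the parent_map section of manifest.json.
parent_map = {
    "model.shop.customer_lifetime_value": ["model.shop.orders", "model.shop.customers"],
    "model.shop.orders": ["source.shop.raw.orders"],
    "model.shop.customers": ["source.shop.raw.customers"],
    "source.shop.raw.orders": [],
    "source.shop.raw.customers": [],
}

for node in sorted(upstream(parent_map, "model.shop.customer_lifetime_value")):
    print(node)
```

Downstream lineage works the same way against the artifact's `child_map`.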
Setup Guide
Configure dbt Cloud or dbt Core integration
Slack Integration
What it provides:
- Channel discussions about data and metrics
- Data team decisions and context
- Recent incident threads
- @mentions of specific metrics or tables
Example questions:
- “What did the team discuss about conversion last week?”
- “Are there any known issues with the orders table?”
- “Who owns the attribution model?”
Setup Guide
Configure Slack workspace integration
Jira Integration
What it provides:
- Tickets related to data issues
- Deployment history affecting data
- Sprint context for data projects
- Issue status and assignments
Example questions:
- “Were there any deployments to the pipeline last week?”
- “Is there a ticket for the missing data issue?”
- “What’s the status of the ETL fix?”
Setup Guide
Configure Jira project integration
Notion/GDocs Integration
What it provides:
- Data dictionaries and glossaries
- Design documents and specifications
- Runbooks and troubleshooting guides
- Meeting notes and decisions
Example questions:
- “What’s the definition of ‘active user’ in our docs?”
- “Is there a runbook for pipeline failures?”
- “What decisions were made in the data review?”
Parallel Context Resolution
When you ask a question, SignalPilot resolves context from all relevant sources simultaneously.
Context Prioritization
Not all context is equally relevant. SignalPilot uses semantic understanding to prioritize:

| Priority | Context Type | When Used |
|---|---|---|
| High | Direct mentions (table names, metrics) | Always included |
| High | Recent query history | Always included |
| Medium | Related schemas | Included if space allows |
| Medium | dbt lineage | Included for data questions |
| Low | General Slack discussions | Summarized if relevant |
| Low | Old documentation | Referenced but not full text |
Context prioritization ensures the AI focuses on the most relevant information without being overwhelmed by tangential details.
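The sketch below shows how parallel resolution and prioritization might fit together: sources are queried concurrently with `asyncio.gather`, then packed into a size budget by priority. Everything here is assumed for illustration, including the `ContextItem` type, the simulated delays, and the character budget. It uses a fixed priority label per item rather than the semantic ranking described above.

```python
import asyncio
from dataclasses import dataclass

PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class ContextItem:
    source: str
    priority: str
    text: str


async def fetch(source: str, priority: str, text: str, delay: float) -> ContextItem:
    await asyncio.sleep(delay)  # stands in for a network or database round trip
    return ContextItem(source, priority, text)


async def resolve_all() -> list[ContextItem]:
    # All three lookups run concurrently; total wait is roughly the slowest one.
    results = await asyncio.gather(
        fetch("slack", "low", "Thread about conversion dip", 0.03),
        fetch("query_history", "high", "SELECT sum(amount) FROM orders", 0.01),
        fetch("dbt", "medium", "orders feeds revenue_daily", 0.02),
    )
    return list(results)


def pack(items: list[ContextItem], budget_chars: int) -> list[ContextItem]:
    """High priority is always kept; the rest is added only while space allows."""
    chosen, used = [], 0
    for item in sorted(items, key=lambda i: PRIORITY_ORDER[i.priority]):
        if item.priority == "high" or used + len(item.text) <= budget_chars:
            chosen.append(item)
            used += len(item.text)
    return chosen


items = asyncio.run(resolve_all())
for item in pack(items, budget_chars=60):
    print(item.priority, item.source)
```

A fuller version would summarize low-priority items instead of dropping them, as the table suggests for Slack discussions.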
Context Security
All context aggregation respects your security boundaries:
Read-Only Access
SignalPilot only reads context—it cannot modify external systems. Database connections are read-only. Slack integration is read-only. No write access to any external MCP source.
Credential Management
Database credentials and API tokens are stored locally in your environment. They are never transmitted to SignalPilot servers. Connection strings are processed locally.
Data Minimization
Only metadata and schemas are transmitted by default. Actual data values stay local unless explicitly included in a query. You control what context is shared.
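As a rough picture of the metadata-only default, the following sketch reduces a result set to column names, value types, and a row count. The helper name `describe_rows` and the summary shape are assumptions for this example.

```python
def describe_rows(rows: list[dict]) -> dict:
    """Return column names and Python type names only, never the values."""
    columns: dict[str, set[str]] = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, set()).add(type(value).__name__)
    return {
        "row_count": len(rows),
        "columns": {name: sorted(types) for name, types in columns.items()},
    }


# Hypothetical query result that stays local.
rows = [
    {"email": "ada@example.com", "total": 19.99},
    {"email": "bob@example.com", "total": None},
]
summary = describe_rows(rows)
print(summary)
```

The summary is enough to answer structural questions while the email addresses and totals never leave the machine.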
Audit Trail
All context resolutions are logged locally. You can see exactly what was fetched and when. Hooks can enforce additional access controls.
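A local audit trail with hooks could look something like the sketch below: each resolution is appended to a JSON Lines file, and any hook can veto the fetch. The function `resolve_with_audit`, the hook signature, and the log format are hypothetical, since the source does not document SignalPilot's hook API.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable

Hook = Callable[[str, str], bool]  # (source, request) -> True to allow


def resolve_with_audit(
    source: str,
    request: str,
    fetch: Callable[[str], str],
    hooks: list[Hook],
    log_path: Path,
) -> str:
    allowed = all(hook(source, request) for hook in hooks)
    entry = {"ts": time.time(), "source": source, "request": request, "allowed": allowed}
    # Log before fetching so denied attempts are recorded too.
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    if not allowed:
        raise PermissionError(f"hook denied access to {source}: {request}")
    return fetch(request)


def block_pii_tables(source: str, request: str) -> bool:
    """Example hook: refuse any request that mentions a hypothetical PII table."""
    return "users_pii" not in request


log_file = Path(tempfile.mkdtemp()) / "context_audit.jsonl"
fake_fetch = lambda request: f"result for {request}"

print(resolve_with_audit("schema", "describe orders", fake_fetch, [block_pii_tables], log_file))
try:
    resolve_with_audit("schema", "describe users_pii", fake_fetch, [block_pii_tables], log_file)
except PermissionError as exc:
    print(exc)
print(log_file.read_text())
```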
Learn More: Security & Privacy
Complete security model documentation
Configuring Context Sources
Adding a Database
Adding an MCP Server
Best Practices
Use Descriptive Names
Name your connections clearly (e.g., “production-snowflake”, “analytics-postgres”) for easy reference.
Review Context in Plans
When approving investigation plans, check that the right context sources are being used.