Updated 2026-03-18

GitHub Connector

Sync GitHub repositories, issues, and documentation into RetainDB for intelligent code search and documentation retrieval.

The GitHub connector indexes your repository content, enabling AI-powered search across code, issues, PRs, and documentation.


Use Cases

  • Code Search — Find relevant code snippets across repositories
  • Documentation Retrieval — Query internal docs alongside code
  • Issue Context — Ground AI responses in existing issues and discussions
  • Onboarding — Help new developers find relevant code and docs

Prerequisites

  1. GitHub Personal Access Token with these scopes:

    • repo (full repository access)
    • read:org (if using organization repos)
    • read:user (for user-specific content)
  2. Repository Access — Token must have access to target repos
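
Before creating a source, it can help to confirm that the token carries the scopes listed above. The sketch below assumes a classic personal access token, for which GitHub returns the granted scopes in the `X-OAuth-Scopes` response header (fine-grained tokens do not send this header). The helper names `missing_scopes` and `fetch_scopes_header` are illustrative and not part of RetainDB.

```python
import urllib.request

REQUIRED_SCOPES = ("repo", "read:org", "read:user")

# Broader GitHub scopes that already include a required one.
IMPLIED_BY = {
    "read:org": {"write:org", "admin:org"},
    "read:user": {"user"},
}


def missing_scopes(scopes_header, required=REQUIRED_SCOPES):
    """Return the required scopes that the header value does not cover."""
    granted = {s.strip() for s in scopes_header.split(",") if s.strip()}
    missing = []
    for scope in required:
        covered = scope in granted or bool(granted & IMPLIED_BY.get(scope, set()))
        if not covered:
            missing.append(scope)
    return missing


def fetch_scopes_header(token):
    """Call the GitHub API once and return the X-OAuth-Scopes header value."""
    req = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("X-OAuth-Scopes", "")
```

For example, `missing_scopes(fetch_scopes_header(token))` returns an empty list when all three scopes are covered. Drop `read:org` from `required` if you do not sync organization repositories.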


Configuration

Creating a GitHub Source

bash
curl -X POST "https://api.retaindb.com/v1/projects/proj_abc123/sources" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Company Docs",
    "connectorType": "github",
    "config": {
      "owner": "acme-corp",
      "repo": "platform-api",
      "branch": "main",
      "paths": ["docs/", "src/"],
      "include_issues": true,
      "include_prs": true,
      "include_wiki": false
    }
  }'

Configuration Options

| Option | Type | Description | Default |
| --- | --- | --- | --- |
| owner | string | GitHub organization or username | Required |
| repo | string | Repository name | Required |
| branch | string | Branch to sync | main |
| paths | array | Paths to include (glob patterns) | ["**/*"] |
| exclude_paths | array | Paths to exclude | ["node_modules/", "dist/"] |
| include_issues | boolean | Index GitHub issues | true |
| include_prs | boolean | Index pull requests | true |
| include_wiki | boolean | Index wiki pages | false |
| sync_mode | string | full or incremental | incremental |
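
To make the defaults concrete, here is a minimal sketch that builds the request body for the create-source call from the options in the table. The function name `build_github_source` is hypothetical, and spelling out every default explicitly is a choice of this sketch; the API presumably applies the same defaults when an option is omitted. Options that do not appear in the table are rejected here.

```python
# Defaults copied from the Configuration Options table.
DEFAULTS = {
    "branch": "main",
    "paths": ["**/*"],
    "exclude_paths": ["node_modules/", "dist/"],
    "include_issues": True,
    "include_prs": True,
    "include_wiki": False,
    "sync_mode": "incremental",
}


def build_github_source(name, owner, repo, **overrides):
    """Return a JSON-serializable body for the create-source request."""
    if not owner or not repo:
        raise ValueError("owner and repo are required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    config = {"owner": owner, "repo": repo}
    for key, default in DEFAULTS.items():
        value = overrides.get(key, default)
        # Copy lists so callers never share the module-level defaults.
        config[key] = list(value) if isinstance(value, (list, tuple)) else value

    if config["sync_mode"] not in ("full", "incremental"):
        raise ValueError("sync_mode must be 'full' or 'incremental'")
    return {"name": name, "connectorType": "github", "config": config}
```

Passing the result through `json.dumps` gives a payload equivalent to the `-d` argument of the curl command, with the omitted options filled in.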

Syncing

Trigger Initial Sync

bash
# Create source first, then trigger sync
SOURCE_ID="src_xyz789"

curl -X POST "https://api.retaindb.com/v1/sources/$SOURCE_ID/sync" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Response

json
{
  "id": "job_abc123",
  "source_id": "src_xyz789",
  "status": "queued",
  "created_at": "2026-03-07T12:00:00Z"
}

Check Sync Status

bash
curl "https://api.retaindb.com/v1/sync-jobs/job_abc123" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Response

json
{
  "id": "job_abc123",
  "source_id": "src_xyz789",
  "status": "completed",
  "progress": {
    "files_indexed": 450,
    "issues_indexed": 125,
    "prs_indexed": 78
  },
  "started_at": "2026-03-07T12:00:00Z",
  "completed_at": "2026-03-07T12:05:00Z",
  "error": null
}

Status Values

| Status | Description |
| --- | --- |
| queued | Waiting to start |
| running | Currently syncing |
| completed | Successfully finished |
| failed | Error occurred |
| cancelled | Cancelled by user |
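
Because `queued` and `running` are the only non-final states, a client can poll the job until it reaches one of the other three. The following is a sketch of such a loop using the status endpoint shown above; the polling interval, the timeout, and the function names are assumptions rather than documented behavior.

```python
import json
import os
import time
import urllib.request

API_BASE = "https://api.retaindb.com/v1"
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def fetch_job(job_id):
    """GET /sync-jobs/{job_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/sync-jobs/{job_id}",
        headers={"Authorization": f"Bearer {os.environ['RETAINDB_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_job, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return the job."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{job_id} is still {job['status']} after {timeout}s")
        sleep(interval)
```

The `fetch` and `sleep` parameters exist so the loop can be exercised without network access; in normal use, `wait_for_job("job_abc123")` is enough. Check `job["error"]` when the returned status is `failed`.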

Incremental Sync

By default, the connector uses incremental sync, only fetching changes since the last sync:

bash
# Trigger incremental sync (usually happens automatically)
curl -X POST "https://api.retaindb.com/v1/sources/src_xyz789/sync" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"sync_mode": "incremental"}'

Manual Full Sync

To force a full re-sync, set sync_mode to full:

bash
curl -X POST "https://api.retaindb.com/v1/sources/src_xyz789/sync" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"sync_mode": "full"}'

Searching Synced Content

After the sync completes, search the indexed content:

bash
curl -X POST "https://api.retaindb.com/v1/memory/search" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user": "developer@example.com",
    "query": "authentication implementation",
    "filters": {
      "source": "github:acme-corp/platform-api"
    },
    "topK": 10
  }'

Search Results

json
{
  "results": [
    {
      "id": "mem_abc123",
      "content": "async function authenticateUser(email: string, password: string) {\n  // Implementation here\n}",
      "source": "github:acme-corp/platform-api",
      "source_type": "code",
      "file_path": "src/auth/login.ts",
      "score": 0.94
    }
  ]
}

Filtering by Source Type

Filter search results by content type:

bash
curl -X POST "https://api.retaindb.com/v1/memory/search" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user": "developer@example.com",
    "query": "API endpoint",
    "filters": {
      "source": "github:acme-corp/platform-api",
      "source_type": "issue"
    }
  }'

Available Source Types

| Type | Description |
| --- | --- |
| code | Source code files |
| issue | GitHub issues |
| pr | Pull requests |
| wiki | Wiki pages |
| readme | README files |
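
A small helper can assemble the search body and catch a mistyped source type before the request is sent. This is a sketch only: `build_search_request` is a hypothetical name, and the `github:owner/repo` filter format and the `topK` field are taken from the search examples above.

```python
# Values from the Available Source Types table.
SOURCE_TYPES = {"code", "issue", "pr", "wiki", "readme"}


def build_search_request(user, query, owner, repo, source_type=None, top_k=None):
    """Return a body for POST /v1/memory/search scoped to one GitHub repo."""
    filters = {"source": f"github:{owner}/{repo}"}
    if source_type is not None:
        if source_type not in SOURCE_TYPES:
            raise ValueError(
                f"source_type must be one of {sorted(SOURCE_TYPES)}, got {source_type!r}"
            )
        filters["source_type"] = source_type

    body = {"user": user, "query": query, "filters": filters}
    if top_k is not None:
        body["topK"] = top_k
    return body
```

For instance, `build_search_request("developer@example.com", "API endpoint", "acme-corp", "platform-api", source_type="issue")` reproduces the body of the filtered request shown earlier.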

Webhooks

Configure webhooks to trigger syncs on GitHub events:

bash
# Create webhook
curl -X POST "https://api.retaindb.com/v1/webhooks" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-server.com/github-webhook",
    "events": ["github.push", "github.pull_request"],
    "source_id": "src_xyz789"
  }'

Then configure GitHub to send webhook events to your endpoint.
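
If your endpoint receives deliveries directly from GitHub, it should verify them before acting on them. The sketch below assumes you set a secret on the GitHub webhook, in which case GitHub signs each delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header. The function names, the choice of pull request actions, and the rule of syncing only on pushes to the configured branch are illustrative assumptions.

```python
import hashlib
import hmac
import json

# Pull request actions that change indexed content (an illustrative subset).
PR_ACTIONS = {"opened", "edited", "closed", "reopened", "synchronize"}


def signature_is_valid(secret, body, signature_header):
    """Compare GitHub's X-Hub-Signature-256 header with our own HMAC of the raw body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, signature_header or "")


def should_trigger_sync(event, body, branch="main"):
    """Decide from the X-GitHub-Event header value and raw body whether to sync."""
    payload = json.loads(body)
    if event == "push":
        return payload.get("ref") == f"refs/heads/{branch}"
    if event == "pull_request":
        return payload.get("action") in PR_ACTIONS
    return False
```

A handler would read the raw request bytes, reject the delivery when `signature_is_valid` returns false, and otherwise call the sync endpoint when `should_trigger_sync` returns true. Always verify against the raw bytes, not a re-serialized copy of the JSON.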


Troubleshooting

401/403 Errors

Cause: Token is invalid or lacks permissions

Solution:

  1. Verify token has required scopes
  2. Check token hasn't expired
  3. Ensure repo is accessible to token owner

Empty Sync Results

Cause: No matching files found

Solution:

  1. Check paths configuration includes correct directories
  2. Verify exclude_paths isn't too aggressive
  3. Confirm branch name is correct
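
One way to work through the checks above is to preview locally which files a given paths and exclude_paths pair would select. The connector's exact matching rules are not documented here, so this sketch makes its own assumptions: entries ending in `/` are treated as directory prefixes from the repository root, `**/*` matches everything, and other entries are shell-style globs tested against both the full path and the file name.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def _matches(path, pattern):
    """Approximate pattern matching; the real connector may behave differently."""
    if pattern in ("**", "**/*"):
        return True  # catch-all, the documented default for paths
    if pattern.endswith("/"):
        return path.startswith(pattern)  # directory prefix from the repo root
    return (
        path == pattern
        or fnmatchcase(path, pattern)
        or fnmatchcase(basename(path), pattern)
    )


def select_paths(files, paths=("**/*",), exclude_paths=("node_modules/", "dist/")):
    """Return the files that match an include pattern and no exclude pattern."""
    return [
        f
        for f in files
        if any(_matches(f, p) for p in paths)
        and not any(_matches(f, p) for p in exclude_paths)
    ]
```

Feeding it the output of `git ls-files` on the configured branch gives a rough preview. An empty result, for example from a `paths` entry of `doc/` in a repository whose directory is `docs/`, points to the configuration rather than the sync itself.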

Rate Limiting

Cause: GitHub API rate limit exceeded

Solution:

  1. Use a GitHub App with higher rate limits
  2. Reduce sync frequency
  3. Contact RetainDB for dedicated rate limits
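
To decide whether to hold off on a sync, you can check how much of the token's GitHub quota remains. The sketch below reads GitHub's `/rate_limit` endpoint, whose `resources.core` object reports `remaining` requests and the `reset` time as a Unix timestamp. The threshold of 500 remaining requests and the function names are arbitrary assumptions; how many GitHub requests a RetainDB sync consumes is not stated in this document.

```python
import json
import time
import urllib.request


def fetch_core_rate_limit(token):
    """Return the 'core' rate-limit object for the given GitHub token."""
    req = urllib.request.Request(
        "https://api.github.com/rate_limit",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["resources"]["core"]


def seconds_to_wait(core, min_remaining=500, now=None):
    """Return 0 if enough quota remains, otherwise the seconds until it resets."""
    if core["remaining"] >= min_remaining:
        return 0
    now = time.time() if now is None else now
    return max(0, int(core["reset"] - now))
```

A scheduler could call `seconds_to_wait(fetch_core_rate_limit(token))` and postpone the sync by that many seconds when the result is non-zero.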

Large Repository Performance

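For large repositories (more than 10,000 files), narrow the included paths, exclude generated and test files, and cap the number of files indexed: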

json
{
  "config": {
    "paths": ["src/", "lib/"],
    "exclude_paths": ["node_modules/", "dist/", "*.test.ts"],
    "max_files": 5000
  }
}

Best Practices

1. Use Path Filters

Sync only the directories and files you need rather than the whole repository:

json
{
  "paths": ["src/", "docs/", "README.md"],
  "exclude_paths": ["node_modules/", "dist/", ".git/"]
}

2. Schedule Regular Syncs

Set up automatic incremental syncs:

bash
# In your CI/CD or cron
curl -X POST "https://api.retaindb.com/v1/sources/src_xyz789/sync" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

3. Monitor Sync Status

Track sync health:

bash
# Get recent sync jobs
curl "https://api.retaindb.com/v1/sources/src_xyz789/sync-jobs?limit=10" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
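
The job list can be reduced to a few health indicators. This sketch assumes the endpoint returns a JSON array of job objects shaped like the Check Sync Status response (`id`, `status`, `started_at`, `completed_at`, `error`); the response format of the list endpoint is not shown in this document, so adjust the parsing if it wraps the array in another object.

```python
from collections import Counter
from datetime import datetime


def _parse(timestamp):
    """Parse timestamps such as 2026-03-07T12:00:00Z into aware datetimes."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def summarize_jobs(jobs):
    """Count jobs per status and average the duration of completed ones."""
    by_status = Counter(job["status"] for job in jobs)
    durations = [
        (_parse(job["completed_at"]) - _parse(job["started_at"])).total_seconds()
        for job in jobs
        if job["status"] == "completed"
        and job.get("started_at")
        and job.get("completed_at")
    ]
    return {
        "by_status": dict(by_status),
        "avg_duration_s": sum(durations) / len(durations) if durations else None,
        "failed_ids": [job["id"] for job in jobs if job["status"] == "failed"],
    }
```

A rising average duration or a non-empty `failed_ids` list is a reasonable trigger for an alert or a closer look at the individual jobs.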
