Guide: Ingesting Gong Transcripts For RAG
Learn how to build Gong transcript ingestion pipelines for RAG use cases.
Ingesting call transcripts from a Gong integration is incredibly useful for any meeting assistant, sales product, or enterprise search AI application. That's why companies like tl;dv and Copy.ai have used Paragon to build native Gong integrations into their AI products, pulling their customers' organization-specific sales context. The steps for building a native Gong ingestion integration are:
End-user setup and authentication
Gong transcript ingestion
Permissions handling
Indexing transcripts
These steps are necessary for a production-ready, end-to-end experience that starts with your users authenticating into your Gong integration and ends with your AI product having context from all of their Gong transcripts.
1) User Setup
User OAuth
Integrating with your users’ Gong data starts with your users authenticating to Gong within your application. To enable this, the first step is to set up an external application in the Gong Developer Hub. This is where you register your application with Gong, including redirect URLs, app logos, and API scopes. The second half of OAuth is setting up your application’s OAuth handling routes to handle authorization codes and store tokens.
Some important tips for handling tokens (a sketch of the token exchange follows this list):
When the GET /oauth2/authorize endpoint is called initially, make sure the requested scopes match the scopes configured in the Gong Developer Hub
Use the POST /generate-customer-token endpoint with your authorization code to get an access token and refresh token
Access tokens expire after 1 day by default
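Below is a minimal sketch of the redirect handler that exchanges the authorization code for tokens. The endpoint path and parameters here follow Gong's OAuth documentation (POST /generate-customer-token with Basic auth using your client credentials), but verify them against the current reference; saveTokens is a hypothetical persistence helper.

```typescript
import express from "express";

// Hypothetical helper: persist tokens per user/tenant in your own datastore.
declare function saveTokens(userId: string, tokens: Record<string, unknown>): Promise<void>;

const app = express();

app.get("/oauth/gong/callback", async (req, res) => {
  const code = req.query.code as string;
  const userId = req.query.state as string; // state you set when starting the OAuth flow

  // Basic auth with your Gong client credentials (per Gong's OAuth docs; verify).
  const basicAuth = Buffer.from(
    `${process.env.GONG_CLIENT_ID}:${process.env.GONG_CLIENT_SECRET}`
  ).toString("base64");

  const params = new URLSearchParams({
    grant_type: "authorization_code",
    code,
    redirect_uri: process.env.GONG_REDIRECT_URI!,
  });

  const tokenRes = await fetch(
    `https://app.gong.io/oauth2/generate-customer-token?${params}`,
    { method: "POST", headers: { Authorization: `Basic ${basicAuth}` } }
  );
  const tokens = await tokenRes.json();

  // Access tokens expire (1 day by default), so store the refresh token too.
  await saveTokens(userId, tokens);
  res.redirect("/integrations/gong?connected=true");
});

app.listen(3000);
```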
From there, your users should be able to authenticate into their Gong account and your application can then access their Gong data using tokens returned from the Gong API!


User Configuration
Once Gong OAuth is completed, users of your SaaS application may expect the ability to configure their Gong integration with your application. This may include the ability to disconnect their Gong account, select specific calls to use within your application, or choose how frequently your application syncs their Gong data.
Below is an example of a Gong integration portal provided by Paragon where users can authenticate and configure their settings. Paragon’s Connect Portal is pre-built and can be easily added to any product UI with our Node.js SDK to support Gong integration authentication and configuration natively.
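For reference, here is a rough sketch of embedding the Connect Portal with Paragon's Connect SDK; the method names follow Paragon's docs, and the project ID and signed Paragon User Token are placeholders you supply from your own setup.

```typescript
import { paragon } from "@useparagon/connect";

// Paragon User Token: a JWT you sign server-side for the signed-in user.
declare const paragonUserToken: string;

// Authenticate the end user against your Paragon project, then open the
// pre-built Connect Portal for the Gong integration.
await paragon.authenticate("<your-paragon-project-id>", paragonUserToken);

paragon.connect("gong", {
  onSuccess: () => console.log("Gong connected"),
  onError: (error) => console.error("Gong connection failed", error),
});
```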

2) Gong Data Ingestion
Architectural Design
Ingesting all Gong call transcripts for an enterprise customer can mean a massive amount of data. Your services that perform the data ingestion will need to handle that data reliably and stay up-to-date with new calls from Gong.
This is an example of a Gong ingestion system:

For large background jobs like data ingestion, an event-driven architecture with queues that allow for retries and failure handling is essential. In addition, because data ingestion involves an initial historical sync and updated syncs, you'll need services that can handle both on-demand data pulls and event-triggered data pulls.
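As a rough sketch of that architecture, the example below uses a queue (BullMQ here, purely as an illustration) so sync jobs can be retried on failure; the job shape and helper names are assumptions, not part of Gong's or Paragon's APIs.

```typescript
import { Queue, Worker } from "bullmq";

// Illustrative job payload: which tenant to sync, plus either a single call
// (webhook-triggered) or a time window (historical or scheduled sync).
interface GongSyncJob {
  tenantId: string;
  callId?: string;
  fromDateTime?: string;
  toDateTime?: string;
}

const connection = { host: "localhost", port: 6379 };
export const gongSyncQueue = new Queue<GongSyncJob>("gong-sync", { connection });

// Retries with exponential backoff so transient Gong API failures don't drop calls.
export async function enqueueIngestionJob(job: GongSyncJob) {
  await gongSyncQueue.add("sync", job, {
    attempts: 5,
    backoff: { type: "exponential", delay: 30_000 },
  });
}

// A worker picks jobs off the queue and runs the actual ingestion logic.
new Worker<GongSyncJob>(
  "gong-sync",
  async (job) => {
    // Pull calls/transcripts from Gong for job.data and index them (see the sections below).
    console.log("Processing Gong sync job", job.data);
  },
  { connection }
);
```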
Initial Historical Sync
When a user first enables their Gong integration within your SaaS application, you’ll need to sync their existing calls (or the subset of calls allowed by their user configuration). The GET /v2/calls endpoint returns all calls from your users’ Gong account. To restrict the sync to a specific time window, you can use the query parameters ?fromDateTime=<ISO-FORMAT>&toDateTime=<ISO-FORMAT>.
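A minimal sketch of that initial pull is below, assuming a valid access token; the cursor-based pagination fields (records.cursor) follow Gong's documented response shape, but verify them against the current API reference.

```typescript
// Minimal sketch: list calls in a time window, following Gong's cursor-based pagination.
// Assumes GONG_ACCESS_TOKEN is a valid OAuth access token for the connected user.
async function listCalls(fromDateTime: string, toDateTime: string): Promise<any[]> {
  const calls: any[] = [];
  let cursor: string | undefined;

  do {
    const params = new URLSearchParams({ fromDateTime, toDateTime });
    if (cursor) params.set("cursor", cursor);

    const res = await fetch(`https://api.gong.io/v2/calls?${params}`, {
      headers: { Authorization: `Bearer ${process.env.GONG_ACCESS_TOKEN}` },
    });
    if (!res.ok) throw new Error(`Gong API error: ${res.status}`);

    const body = await res.json();
    calls.push(...(body.calls ?? []));
    cursor = body.records?.cursor; // undefined on the last page
  } while (cursor);

  return calls;
}
```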
To get extensive call data with metadata like attendees, the POST /v2/calls/extensive endpoint can be used with a callId filter in the request body. What’s most useful for RAG is the transcript data. Your application can access these transcripts by calling the POST /v2/calls/transcript endpoint, again with a callId filter in the request body.
The transcript data will look like this:
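(The shape below is a rough sketch of the POST /v2/calls/transcript response based on Gong's public API docs; the field names and values are illustrative, so verify them against the current reference.)

```typescript
// Approximate shape of a POST /v2/calls/transcript response (illustrative values).
const exampleTranscriptResponse = {
  requestId: "abc123",
  records: { totalRecords: 1, currentPageSize: 1, currentPageNumber: 0 },
  callTranscripts: [
    {
      callId: "1234567890",
      transcript: [
        {
          speakerId: "56781234",
          topic: "Pricing",
          sentences: [
            { start: 460230, end: 462343, text: "Happy to walk through pricing." },
            { start: 463343, end: 465343, text: "We have two tiers." },
          ],
        },
      ],
    },
  ],
};
```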
Updated Syncs
The initial historical sync handles the largest amount of data, as the process could be ingesting hundreds of call transcripts. Updated syncs are how your RAG application stays current with new call data from Gong so your users can retrieve recent calls. There are two patterns for updated syncs:
Webhook-triggered syncs: whenever a new call is created in Gong, kick off a data ingestion sync for that new call
Schedule-based syncs: set up a default cadence or allow users to select how often they’d like to sync their Gong data (i.e. poll for new calls every week, day, hour, etc.)
Webhook-triggered syncs provide the closest to real-time experience for your users; however, you’ll need to make sure your webhook handler supports multiple tenants and maps incoming events to the right customer. Webhooks can also be notoriously unforgiving: once an event fires, if your application misses or mishandles it, you’ll need to manually backfill that data with the Gong API.
To set up webhooks in Gong, you’ll need to enable rules in the Automations tab of the Developer Hub. From there, you need to configure the endpoint that listens for webhook events as well as the webhook authentication.
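Here is a minimal sketch of a multi-tenant webhook receiver; the payload field (callId) and the authentication mechanism depend on how you configure the rule in Gong's Automations tab, so treat them as assumptions, and enqueueIngestionJob is the helper from the queue sketch above.

```typescript
import express from "express";

// Helper from the queue sketch above (or your own job queue).
declare function enqueueIngestionJob(job: { tenantId: string; callId: string }): Promise<void>;

const app = express();
app.use(express.json());

app.post("/webhooks/gong/:tenantId", async (req, res) => {
  const { tenantId } = req.params;

  // TODO: verify the webhook's authentication (shared secret, JWT, etc.) before trusting it.

  const callId = req.body?.callId; // assumed field name; depends on your Automations rule
  if (callId) {
    // Enqueue instead of processing inline so a failed ingestion can be retried.
    await enqueueIngestionJob({ tenantId, callId });
  }

  res.sendStatus(202); // acknowledge quickly; heavy lifting happens in a worker
});
```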

For schedule-based syncs, you can use the GET /v2/calls?fromDateTime=<ISO-FORMAT>&toDateTime=<ISO-FORMAT> endpoint to query for calls in the time window dictated by your schedule. This implementation puts your application in control of pulling calls; however, it’s less real-time and there are many “empty trips” where your application polls for new calls even when none exist.
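A sketch of a schedule-based sync that only polls the window since the last successful sync is below; getLastSyncTime and setLastSyncTime are hypothetical watermark helpers backed by your own datastore, while listCalls and enqueueIngestionJob come from the earlier sketches.

```typescript
// Hypothetical watermark helpers backed by your own datastore.
declare function getLastSyncTime(tenantId: string): Promise<string>;
declare function setLastSyncTime(tenantId: string, isoTime: string): Promise<void>;
// Helpers from the earlier sketches.
declare function listCalls(fromDateTime: string, toDateTime: string): Promise<{ id: string }[]>;
declare function enqueueIngestionJob(job: { tenantId: string; callId: string }): Promise<void>;

async function runScheduledSync(tenantId: string) {
  const fromDateTime = await getLastSyncTime(tenantId); // e.g. "2024-05-01T00:00:00Z"
  const toDateTime = new Date().toISOString();

  // GET /v2/calls?fromDateTime=...&toDateTime=... for just the window since the last sync.
  const calls = await listCalls(fromDateTime, toDateTime);
  for (const call of calls) {
    await enqueueIngestionJob({ tenantId, callId: call.id });
  }

  // Advance the watermark only after the window has been enqueued.
  await setLastSyncTime(tenantId, toDateTime);
}
```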
With Paragon, both webhook-triggered and schedule-based syncs can be represented as Paragon Workflows, allowing developers to easily set up these updated syncs and define logic like JavaScript functions, branching, and for loops within a Workflow. Paragon is also releasing Managed Sync, a solution that lets our customers retrieve their users’ up-to-date Gong data with just an API call, if all you need is your users’ Gong data.

3) Permissions Handling
Permissions are essential for any production-ready application. It’s relatively easy to put together an MVP of a RAG-enabled application; however, a production-ready application should only allow an LLM to retrieve data that the authenticated user has permission to access. Your enterprise customers may not want everyone on their sales team to have access to Coaching and Scoring stats.
Gong’s permissions structure has Workspaces, Permissions Profiles, Users, and Calls:
Call access can be given to all Users in a Workspace, Users under a manager, or even specific one-off Users
Users with certain Permissions Profiles have the ability to manage folders and score calls
There are also specific objects like Coaching, Insights, and Stats that certain Permissions Profiles have access to
We can implement permissions in our RAG application that follow Gong’s native permissions with a few different patterns which we’ll discuss next.
Permissions Patterns
There are a few different design patterns we’ve seen from our customers to enforce permissions in their RAG workflow. We’ll be going in-depth into two methods in particular:
Checking with the Gong API at prompt-time
Modeling Gong permissions in a self-hosted database
Checking with the Gong API at prompt-time
Whenever your AI application retrieves context from Gong, check the Gong API to ensure the authenticated user has permissions to the call.

This pattern is the safest way to enforce permissions as your application is always consulting the Gong API, the source of truth. Changes to permissions in Gong are immediately reflected in your RAG application.
Where this pattern potentially breaks down is when your top-K increases. If your RAG application needs to call external APIs for tens of calls to check permissions before returning an answer, latency will suffer. For chatbots, response times need to be kept short. However, for workflows that don’t need to synthesize tens of documents from external sources, or for non-chatbot use cases where slightly longer response times are acceptable, checking Gong permissions post-retrieval is the safest pattern to use.
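In code, the pattern looks roughly like the sketch below; checkGongCallAccess is a hypothetical helper that asks Gong whether the user can see a given call (via whichever call-access endpoint your integration uses), and the point here is the post-retrieval filter rather than any specific endpoint.

```typescript
// Hypothetical helper: ask Gong whether this user can see this call.
declare function checkGongCallAccess(userEmail: string, callId: string): Promise<boolean>;

interface RetrievedChunk {
  text: string;
  metadata: { callId: string };
}

// Post-retrieval filter: drop any chunk the authenticated user cannot see.
async function filterByPermissions(userEmail: string, chunks: RetrievedChunk[]) {
  // Check all retrieved chunks in parallel to keep added latency to roughly one round trip.
  const allowed = await Promise.all(
    chunks.map((c) => checkGongCallAccess(userEmail, c.metadata.callId))
  );
  return chunks.filter((_, i) => allowed[i]);
}
```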
Modeling Gong permissions
The second method for enforcing permissions is modeling them within your application, using something like a ReBAC graph or an ACL table. Storing permissions in your database layer makes checking them much faster than API calls, decreasing latency when multiple permissions need to be checked, e.g. a RAG answer that cites a Gong call, Google Drive files, and Notion pages, where permissions need to be verified across all sources.
An ACL (Access Control List) keeps a record for each call listing the users that have permission to view it. When your RAG application retrieves data from a call, it queries the ACL to ensure the user is in that call’s list of allowed users, as sketched after the table below.
| Gong Call ID | Users |
|---|---|
| a46921 | [john@acme.com, tim@acme.com, …] |
| c4867271 | [lyla@acme.com] |
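A minimal sketch of the ACL check against a table like the one above (shown here as an in-memory map; in practice this would be a database query):

```typescript
// In-memory stand-in for the ACL table above; in production this is a database lookup.
const callAcl: Record<string, string[]> = {
  a46921: ["john@acme.com", "tim@acme.com"],
  c4867271: ["lyla@acme.com"],
};

// Post-retrieval check: only keep chunks from calls the user is allowed to see.
function userCanSeeCall(userEmail: string, callId: string): boolean {
  return (callAcl[callId] ?? []).includes(userEmail);
}
```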
The ACL method is a straightforward model. However, if you remember how Gong permissions work, where calls can be shared with specific users or with all reports under a manager, and permissions profiles grant access to additional data, flattening these permissions into a table can get messy.
ReBAC (Relationship-based Access Control) graphs can model these relationships better. Visually it’s a bit messier, but we can model permissions based on management hierarchy and permissions profiles, and propagate complicated rules such as “all reports should have access to calls within their team.” Another advantage of using a ReBAC graph is that permissions changes are easy to apply: if “Tim” moved to “Lyla’s” team, only one edge needs to be updated, whereas with ACLs, many call records would need to be updated.
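To make that concrete, here is a generic sketch of ReBAC-style relationship tuples (not tied to any particular authorization engine); access is derived by walking relationships rather than flattening them per call.

```typescript
// ReBAC-style relationship tuples: subject -> relation -> object.
type Tuple = { subject: string; relation: string; object: string };

const tuples: Tuple[] = [
  { subject: "user:tim@acme.com", relation: "member", object: "team:lyla-reports" },
  { subject: "team:lyla-reports", relation: "viewer", object: "call:a46921" },
  { subject: "user:lyla@acme.com", relation: "owner", object: "call:c4867271" },
];

// Moving Tim to a different team means updating a single "member" tuple; with an ACL
// table, every call row that listed him would need to change.
```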

Permission Updates
Each pattern has its tradeoffs:
Checking the Gong API at prompt-time
Pros: don’t need to store permissions, always up-to-date with permissions changes, uses the native Gong permissions
Cons: scales poorly for low-latency use cases where many calls or data sources need permissions checks
Modeling Gong permissions
Pros: low-latency that can handle many permissions checks (faster than API calls)
Cons: permissions can become stale
We can mitigate this last con, stale permissions, with the same approach described in the “Updated Syncs” section for ingesting Gong data. Storing Gong permissions is really just another ingestion problem, where the data being ingested is permissions rather than transcripts.
We can use the same mechanisms - webhook-based and schedule-based syncs - to ensure that our permissions data is up-to-date.
4) Indexing Transcripts
Indexing Strategies
Transcripts are a form of unstructured data and can be indexed just like a document. Starting with chunking, fixed-size chunking (512 or 1024 tokens) is appropriate. There are other interesting ways of chunking transcripts, including semantic chunking, where chunk break points are drawn whenever sentences are semantically different, but the effectiveness of these methods over fixed-size chunking is unproven (read the research on this topic if interested).
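A minimal sketch of fixed-size chunking is below; it uses a rough 4-characters-per-token heuristic to stay dependency-free, whereas in practice you would chunk with a real tokenizer.

```typescript
// Fixed-size chunking with overlap, approximating tokens as ~4 characters each.
function chunkTranscript(text: string, chunkTokens = 512, overlapTokens = 50): string[] {
  const approxCharsPerToken = 4;
  const chunkSize = chunkTokens * approxCharsPerToken;
  const overlap = overlapTokens * approxCharsPerToken;

  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```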
Another suggestion for indexing Gong transcripts in particular is including speaker labels. Gong transcripts are formatted as JSON where each monologue is attributed to a speakerId. Converting this JSON to text requires concatenating each sentence’s text to build the transcript. To identify the speaker behind a speakerId, use the POST /v2/calls/extensive endpoint to get the parties array, which can be used to map speakerIds to speakers.
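Here is a sketch of that mapping; the transcript and parties field names follow Gong's public API docs for POST /v2/calls/transcript and POST /v2/calls/extensive, but verify them against the current reference.

```typescript
interface Sentence { start: number; end: number; text: string }
interface Monologue { speakerId: string; topic?: string; sentences: Sentence[] }
interface Party { speakerId?: string; name?: string; emailAddress?: string }

// Build speaker-labeled transcript text from Gong transcript JSON plus the parties array.
function transcriptToText(monologues: Monologue[], parties: Party[]): string {
  const speakerNames = new Map<string, string>();
  for (const p of parties) {
    if (p.speakerId) speakerNames.set(p.speakerId, p.name ?? p.emailAddress ?? "Unknown speaker");
  }

  return monologues
    .map((m) => {
      const speaker = speakerNames.get(m.speakerId) ?? m.speakerId;
      const text = m.sentences.map((s) => s.text).join(" ");
      return `${speaker}: ${text}`;
    })
    .join("\n");
}
```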
Metadata Inclusion
Records in a vector database by default only include the underlying text and a vector representation of that text. While these are the only pieces of data needed for RAG, metadata fields can also be used to enhance RAG data or as filters when querying for RAG. Consider including these fields as metadata (a sketch of the resulting record follows this list):
Call ID: helps keep track of what call the data is from; also how you enforce permissions with metadata filtering
Parties: keep track of the people that attended the call the data is from
Timestamp: gives users information on how recent retrieved data is
Topic labels: other use case specific labels your application may find useful, such as labeling calls as “introductory,” “upsell,” “checkin,” etc.
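Below is a sketch of the record built per chunk with those metadata fields; embed() is a hypothetical embedding call, and the call object's field names are illustrative.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical embedding helper (e.g. a call to your embedding model of choice).
declare function embed(text: string): Promise<number[]>;

async function buildRecord(
  chunk: string,
  call: { id: string; parties: string[]; started: string; topicLabel?: string }
) {
  return {
    id: `${call.id}-${randomUUID()}`,
    values: await embed(chunk),
    metadata: {
      text: chunk,
      callId: call.id,             // also usable for permission filtering
      parties: call.parties,       // attendees of the call
      timestamp: call.started,     // how recent the source call is
      topicLabel: call.topicLabel, // optional use-case-specific label
    },
  };
}
```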
Vector Database Namespaces
Namespaces are how you partition data in your vector database. When retrieving data for RAG, your application can only query from a single namespace at a time. This is useful for use cases like multi-tenancy for your B2B SaaS or development/testing/production environments.
You would never want a RAG query to return data from an organization outside of your users’ own organization. Thus keeping your enterprise customers’ data in separate namespaces is safer than using metadata filters.
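A sketch of namespace-per-customer isolation using a Pinecone-style API is below (the SDK calls are based on Pinecone's TypeScript client; adapt to your vector database).

```typescript
import { Pinecone } from "@pinecone-database/pinecone";

// Placeholders: records built as in the sketch above, and an embedded user query.
declare const records: { id: string; values: number[]; metadata?: Record<string, any> }[];
declare const queryEmbedding: number[];

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index("gong-transcripts");

// Each customer's records live in their own namespace...
await index.namespace("customer-acme").upsert(records);

// ...and queries are scoped to a single namespace, so another tenant's data can never come back.
const results = await index.namespace("customer-acme").query({
  vector: queryEmbedding,
  topK: 5,
  includeMetadata: true,
});
```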


Even if two customers have indexed the same text, their records live in separate namespaces, so neither can retrieve the other’s data.
Wrapping Up
These are the most important considerations for Gong data ingestion and building an end-to-end Gong integration for RAG:
User Setup: allowing users to authenticate into their Gong account from your application and setup their integration
Gong data ingestion: pulling historical calls and syncing new calls
Permissions handling: enforcing Gong permissions as part of the RAG workflow
Indexing transcripts: effectively upserting data to a vector database for RAG retrieval
The entire process of building Gong data ingestion can be quite long and taxing on your engineering resources. As mentioned briefly, Paragon’s Workflow and Managed Sync solutions were purpose-built for 3rd-party integrations and AI use cases like RAG, cutting down the development time for building Gong data ingestion. If you’d like to learn more about Paragon, explore our popular use cases or book a demo with our team.
Jack Mu, Developer Advocate