Build vs. Buy - AI Use Cases

AI has prompted an explosion of new SaaS companies all applying AI to different industries and use cases. What has become apparent is that fitting AI into these different industries and use cases requires integrations with existing systems and platforms.

These integrations enable your AI application to have access to customer-specific knowledge by integrating with data from your users’ existing platforms like CRMs and external files, and perform agentic actions like creating new records and documents on behalf of your users. Some key use cases we are seeing for integrations from AI applications are:

  1. Data ingestion for RAG: using data from users’ file storage and other 3rd-party platforms

  2. Tool calling: using 3rd-party APIs to query or create data in your users’ 3rd-party platforms

  3. AI workflow products: enabling your users to define workflows in your product that have AI features, agent tools, and 3rd-party actions

To give some examples: AI companies building products for sales teams have integrations across these use cases with CRMs like Salesforce and HubSpot and productivity apps like Slack, Google Calendar, and Gmail. Enterprise search and knowledge-based AI products integrate with file storage and messaging platforms like Google Drive and Slack to search across business context.

But the question isn't "should you build integrations"; it's how you should build them. In this article, we'll look at the process of developing integrations for AI use cases in-house and with Paragon, the only embedded integration platform that supports both heavy asynchronous jobs and fast, responsive synchronous calls, both of which are necessary for building the full breadth of integration-enabled AI features.

End-user experience and configuration

Before your AI application can start ingesting your users’ data (or interact with their 3rd-party APIs at all), your users must first authenticate, authorize, and configure their 3rd-party integrations natively within your application.

In-house:

Building in-house, you'll have complete control over what the authentication portal looks like and how users can configure settings between your app and the 3rd-party provider. This control comes with the responsibility of understanding the nuances of how each API handles authentication and building the end-user interface for each integration you're interested in adding.

Authentication & Authorization:

Building authentication and authorization for integrations means that OAuth is completely in your team's hands: registering your app with the 3rd-party provider (on their developer console), choosing scopes, and setting up redirects. Most of these steps are prerequisites even if you build with a solution like Paragon, but the real challenge is token management. After the initial login, you will need to handle and store your users' 3rd-party access tokens and refresh tokens.

This means making sure tokens are always valid so users don't have to re-authenticate constantly, building different refresh logic depending on each 3rd-party provider's token policies, properly storing and encrypting user tokens, and handling different credentials at the user and organization level.

You can read more about auth management here.

3rd-Party Configurations

For configuration, building in-house means building the logic for how fields should be mapped between your application and a 3rd-party integration provider, and storing those mappings per user. A great example is HubSpot's self-built Salesforce integration, which lets users configure how data maps and syncs between the two platforms.

With Paragon:

Buying an embedded integration platform like Paragon means that much of the work on authentication, authorization, and user configurations can be offloaded to Paragon.

Authentication & Authorization

Your team still needs to register your app with the 3rd-party platform so you have control over scopes and can set up a completely white-labeled experience. From there, you can embed Paragon's UI component (the Connect Portal) directly in your application, which fully manages the OAuth workflow. If you want to use your own UI, you can opt for a headless implementation as well.

Whether you use our Connect Portal or a headless implementation using Paragon’s SDK, Paragon handles all the access tokens and refreshes across every integration (even custom integrations), which means your engineering team can call 3rd-party APIs or subscribe to 3rd-party webhook events through Paragon without ever needing to worry about different authentication or token handling methods.

3rd-Party Configurations

Additionally, Paragon’s Connect Portal provides out-of-the-box support for field mappings, filepickers, and common configuration settings such as enabling users to select a Slack channel for notifications to go to.

For field mappings between your application and CRMs, Paragon can surface and store these 3rd-party mappings for you to use. For filepickers, Paragon’s SDK can surface native filepickers like Google Drive’s directly in your application.

Integration Use Cases for AI:

After your users enable integrations in your application with the right auth and configurations, there are a few prevalent use cases for AI applications that are now possible. We’ll cover:

  1. RAG data ingestion

  2. AI agent tool calling

  3. AI workflows

Let’s start with data ingestion for RAG. (If you’re not sure what RAG is, check out our guide to RAG).

Use Case #1: Data Ingestion for RAG

Successful RAG implementations start with the ability to access and index data, but a production-ready implementation means much more:

  1. Access users’ data sources like their Google Drive, their Notion, their Confluence, etc.

  2. Extract the text from their data using the integration provider’s API

  3. Index it into your vector database for retrieval when your AI is prompted

  4. Implement a permissions pattern (i.e. keep a database mapping users to data permissions and only allow your AI application to return data that a user has permission to access)

  5. Stay up-to-date with changes in your users’ data and permissions (i.e. new files added to Google Drive, new read-only users added to a file’s permissions)

In-house:

After handling your users’ OAuth, your application can use your users’ 3rd-party tokens with API calls and start retrieving data.

Access, Extract, and Index Data

For steps 1-3, this means researching which APIs the integration provider offers that can return your users' data to your application. For Google Drive, that may look like using the export endpoint to access and extract the text from a Google Drive file. (If you're interested in what it's like using Google Drive APIs for data ingestion, read our first-hand experience building data ingestion in-house.)

const driveCreds = await getLatestDriveCredential(email);
const headers = new Headers();
headers.append("Content-Type", "application/json");
headers.append("Authorization", "Bearer " + driveCreds[0].access_token);

// contentParams specifies the export format, e.g. "mimeType=text/plain"
const res = await fetch(
  `https://www.googleapis.com/drive/v3/files/${file.id}/export?` + contentParams,
  {
    method: "GET",
    headers: headers
  }
);

Your data service that calls 3rd-party APIs to extract data will need to handle massive data volumes, scale accordingly, and be fault tolerant. Think about how much data is potentially in a user's Google Drive or a Salesforce CRM with years of history. Whether it's through a cluster of microservices built around queues or a serverless setup with Lambda functions, data ingestion can be a costly and complex process.

After retrieving text from your users' platforms, you can then index the data into the vector database of your choice by chunking the text into appropriate token lengths, transforming the chunks into vectors with an embedding model, and finally upserting them into a vector database.

Permissions Pattern:

Similar to accessing, extracting, and indexing data, step 4 requires those same steps but for permissions. Your application needs to keep permissions data for every data asset your AI application has ingested and enforce those permissions at your users’ prompt time.

Most 3rd-party integration providers will have an API that returns permissions for a given data asset, but there are also many non-obvious permission sets that may not be easily retrieved.

{
	"readers": [
		"user1@example.com",
		"user2@example.com"
	],
	"writers": [
		"user3@example.com",
		"user4@example.com"
	],
	"owner": "user5@example.com"
}
From there you can store that data in a database like a relational table or a graph for your AI application to use.

Stay Up-to-Date

Lastly, it's not enough to ingest your users' data and its respective permissions once; your AI application must also stay in sync with any data and permission changes in each of those 3rd-party applications. Whether it's through webhooks or polling, a durable data sync service is necessary to make sure new data assets and permission changes are available to your AI application for re-indexing.

Your data sync service needs to accommodate messages from every incoming webhook and/or polling job across every 3rd-party integration provider, and across every one of your customers. The sheer number of webhooks and poll jobs can be difficult to manage, as a single integration provider will likely require several webhooks; for Salesforce, you may need a listener per object (contact, account, opportunity, lead, etc.). And not only is the number of webhooks you need to listen for large, the volume of webhook messages you need to handle can be massive, especially for high-volume webhooks common on platforms like Shopify.

With Paragon:

Paragon’s Workflow engine is purpose-built for reliable, long-running, asynchronous jobs like data ingestion. We’ll break this down into a few components.

Access, Extract, and Index Data

Paragon Workflows can be authored in our low-code builder or in code via our SDK. These Workflows allow teams to easily define ingestion pipelines that can be triggered by your application, such as whenever a user authenticates and enables an integration in your app, or through an API request from your app to a Paragon workflow endpoint. With pre-built triggers and Integration Actions that abstract common 3rd-party API use cases (like getting Salesforce contact data or getting Notion page contents), your team can spend less time researching 3rd-party APIs and focus on building logic.

Once your users’ data is piped to your application backend, your application can then chunk, embed, and index into your vector database in any way your team sees fit. Unlike many other integration platforms, Paragon maintains the full schema and payload from the 3rd-party data to ensure you have full access to all of your users' data, including custom objects and fields.

Permissions Pattern

Similar to building in-house, your AI application still needs to keep a permissions database for your users' 3rd-party data assets. However, Paragon Workflows also make it easy to get permissions metadata from the 3rd-party APIs, supporting complex branching logic and fan-outs (similar to if statements and for loops) so you can pull permissions from any 3rd-party source, from individual documents to folders to groups. (For a walkthrough of how Paragon enables permissions for RAG, see our permissions tutorial.)

Stay Up-to-Date

We mentioned that Paragon Workflows can be triggered when a user enables an integration or via API request, but an incredibly useful trigger type we haven't discussed yet is the 3rd-party webhook trigger. The same workflows that send text and permissions data to your AI application backend can be triggered by a 3rd-party webhook: anytime data changes in a 3rd-party provider (like a file being updated in Google Drive or a new editor being added to a file), your workflow fires and pushes the new data or permissions to your AI application.

Unlike building in-house, you don't need to:

  • Research how to work with the 3rd-party webhooks APIs

  • Build webhook-listening and queuing infrastructure to successfully catch all incoming events

All of these workflows (data ingestion, permissions ingestion, and webhook-triggered) run on Paragon's workflow engine, which has built-in auto-scaling, resiliency, and monitoring to handle high-volume webhook requests for production workloads.

Use Case #2: Tool Calling with 3rd-party APIs

Data ingestion is key for the RAG use case; tool calling is key for the AI agent use case, where your AI application doesn't just use your users' 3rd-party data as context to answer questions, but also performs actions like creating and updating data in your users' 3rd-party platforms. (If you're not familiar with tool calling, check out our article on what tools for AI agents are.)

Every AI agent tool involves:

  1. A metadata JSON with tool definitions and inputs

  2. A custom function that runs whenever an agent calls that tool

In-house:

Building tools that use 3rd-party APIs means that for every tool you’d like to give your agent, you’ll need to manually define that tool’s metadata JSON and its custom function.

Metadata JSON

The metadata an AI agent needs to identify when a tool should be used and how it should be used (its inputs) is extremely important for a proper tool call. Manually writing a tool for something as simple as sending an email can take all of this JSON:

{
    "type": "function",
    "function": {
        "name": "GMAIL_SEND_EMAIL",
        "description": "Send a email in Gmail",
        "parameters": {
            "type": "object",
            "properties": {
                "toRecipients": {
                    "type": "array",
                    "description": "To : Specify the recipients as either a single string or a JSON array. (example: \\"[\\n  \\"recipient1@domain.com\\",\\n  \\"recipient2@domain.com\\"\\n]\\")",
                    "items": {
                        "type": "string"
                    }
                },
                "from": {
                    "type": "string",
                    "description": "From : Specify the email of the sender."
                },
                "subject": {
                    "type": "string",
                    "description": "Subject : Specify the subject of the message."
                },
                "messageContent": {
                    "type": "string",
                    "description": "Message Content : Specify the content of the email message as plain text or HTML."
                },
                "attachments": {
                    "type": "string",
                    "description": "Attachments : Accepts either a single file object or a JSON array of file objects (example: \\"[{file object}, {file object}]\\")"
                },
                "additionalHeaders": {
                    "type": "object",
                    "description": "Additional Headers : Specify any additional header fields here. (example: \\"{\\n  \\"reply-to\\": \\"Sender Name <sender@domain.com>\\",\\n}\\")"
                }
            },
            "required": [
                "toRecipients",
                "from",
                "subject",
                "messageContent"
            ],
            "additionalProperties": false

We’ve seen that tool calling can be unreliable with more complex tools and more specific use cases. To minimize situations where your agent incorrectly uses tools, your team needs to optimize each tool's description and properties through testing and iteration.

Custom Functions

For each tool, while the JSON metadata gives your AI agent the information it needs to know when and how to use a tool, a custom function also needs to be written; it's what runs after the tool is called. Imagine you'd like to give your AI agent a tool to schedule Google Calendar events for your user. You would have to research Google Calendar's APIs to find the ones that fit the tool's functionality, work out each API's URL and query parameters, and write a function for each tool. Here's an example of the custom logic needed just to get busy times for a user's current week.

const getSchedule = FunctionTool.from(
  async () => {
    function addDays(dateObj, days) {
      dateObj.setDate(dateObj.getDate() + days);
      return dateObj;
    }
    // List the user's calendars (authorization headers omitted for brevity)
    const calRes = await (await fetch('https://www.googleapis.com/calendar/v3/users/me/calendarList')).json();

    const cals = [];
    for (const cal of calRes.items) {
      cals.push({ id: cal.id });
    }

    // Query busy windows across all calendars for the next 7 days
    const busyRes = await (await fetch('https://www.googleapis.com/calendar/v3/freeBusy', {
      method: "POST",
      body: JSON.stringify({
        timeMin: new Date(),
        timeMax: addDays(new Date(), 7),
        items: cals
      })
    })).json();

    if (busyRes.calendars) {
      // Each calendar in the response contains its own list of busy windows
      return busyRes.calendars;
    }
    return "Calendar could not be pulled";
  },
  {
    name: "getSdrSchedule",
    description: "Use this function when a user asks to schedule a meeting with the SDR. This function returns when " +
        "the user is busy and UNABLE to meet." +
        "Times are provided in UTC format. Use the convertUtcDatetimeToPstDatetime tool to convert UTC datetimes to" +
        "PST before returning results.",
    parameters: {
      type: "object",
      properties: {},
      required: [],
    },
  },
);

const convertUtcDatetimeToPstDatetime = FunctionTool.from(
    ({datetime}: {datetime: string}) => {
        return JSON.stringify({pstDatetime: new Date(datetime).toString()});
    },
    {
        name: "convertUtcDatetimeToPstDatetime",
        description:
            "Use this function to convert UTC datetimes to PST. Use this function whenever UTC datetimes are given such as" +
            " after the getSdrSchedule function tool is used.",
        parameters: {
            type: "object",
            properties: {
                datetime: {
                    type: "string",
                    description: "datetime in UTC",
                },
            },
            required: ["datetime"],
        },
    },
);

With Paragon:

While Paragon's Workflows are a perfect fit for data ingestion, a use case that requires handling large amounts of data over long periods of time (asynchronous jobs), tool calling is a use case where real-time, fast-acting requests are necessary so that agents can respond to users immediately (synchronous jobs). Paragon's ActionKit is our first-class solution for AI agent tools, providing access to 1,000+ different tools through one set of APIs.

Metadata JSON

Paragon's ActionKit is designed to let AI applications scale the number of tools they support quickly and reliably across the 3rd-party APIs your application may need to integrate with. Take the example of needing tools for your agent to interact with Salesforce: with a single GET request to our actionkit.useparagon.com/.../actions endpoint, your AI agent gets the metadata JSON for hundreds of different tools, including tools for searching Salesforce records, creating records, and retrieving records with custom fields. Below is just one example of a tool that ActionKit provides.

{
    "name": "SALESFORCE_SEARCH_RECORDS_CONTACT",
    "description": "Triggered when a user wants to Search Contact in Salesforce",
    "parameters": {
        "type": "object",
        "properties": {
            "filterFormula": {
                "type": "string",
                "description": "Filter search : Search for records that match specified filters."
            },
            "includeAllFields": {
                "type": "boolean",
                "description": "Include All Fields"
            },
            "paginationParameters": {
                "type": "object",
                "description": "Pagination parameters for paginated results",
                "properties": {
                    "pageCursor": {
                        "type": "string",
                        "description": "The cursor indicating the current page"
                    }
                },
                "required": [],
                "additionalProperties": false
            }
        },
        "required": [],
        "additionalProperties": false

The descriptions for tools and inputs were also optimized by our team to increase tool-calling accuracy for AI agents. If you're interested in learning more about how to optimize tool calling for your AI product, we've published research on the topic.

Custom Functions

ActionKit doesn't just provide the metadata with the necessary descriptions and inputs for each 3rd-party tool; it also includes a way to execute the underlying action when a tool is called. A POST request to the actionkit.useparagon.com/.../actions endpoint with the action name and parameters is all it takes to run a tool.

for tool_call in message.tool_calls:
    run_actions_body = {
        "action": tool_call["name"],
        "parameters": tool_call["args"]
    }
    # Execute the tool through ActionKit using the user-scoped Paragon JWT
    response = requests.post(
        "https://actionkit.useparagon.com/projects/" + os.environ['PARAGON_PROJECT_ID'] + "/actions",
        headers={"Authorization": "Bearer " + encoded_jwt},
        json=run_actions_body
    )
    tool_result = response.json()

    # Feed the tool result back to the agent as a tool message
    outputs.append(
        ToolMessage(
            content=json.dumps(tool_result),
            name=tool_call["name"],
            tool_call_id=tool_call["id"],
        )
    )

This one block of code is all that’s necessary to provide the custom logic needed for hundreds of different tools using 3rd-party APIs.

An additional benefit of using ActionKit is that it provides custom actions on top of the 3rd-party API, such as NOTION_GET_PAGE_CONTENTS and GOOGLE_CALENDAR_GET_AVAILABILITY (remember how much custom code it took to implement this). These custom actions abstract much of the work it takes to work with 3rd-party APIs, making it easier to equip your agent with tools to complete tasks that may have required multiple API calls and additional context.

Use Case #3: AI Workflows

The last use case we’ve seen from many AI companies is adding 3rd-party integration steps to their AI workflow orchestration product. To clarify, this is different from Paragon's Workflows - instead, this would be a workflow feature within your own product.

In an example from Copy.ai, their workflow platform provides their users a library of 3rd-party webhook triggers (like when leads are created), AI steps (such as a node to scrape websites and profiles), and dozens of 3rd-party actions (like sending a Microsoft Teams message). When it comes to building integration nodes for an AI workflow product like this, you'll need:

  1. 3rd-Party Triggers

  2. 3rd-Party Actions

Building In-house:

3rd-Party Triggers

Similar to building data ingestion in-house, building workflow triggers in-house requires a service that listens to webhooks and/or polls a 3rd-party API for updates, for every single 3rd-party event type.

What's difficult in scaling webhook triggers is not just the workloads your data sync service potentially has to handle, but also the different webhook behaviors across providers. For example, some integration providers batch messages together to avoid overloading your webhook listener, while others send every message separately; some providers use JWTs for verification, while others like Slack use HMAC signatures (see the sketch below). Building in-house requires integration-specific knowledge of webhook and API behavior, which is manageable if you're building just one or two core integrations, but becomes difficult if you need to support triggers across dozens of integrations in your workflow product.

3rd-Party Actions

Providing workflow steps that perform actions in 3rd-party platforms will require calling 3rd-party APIs using your customer's credentials. Just like 3rd-party triggers, this requires learning the different API behaviors of different providers. It also means that building 3rd-party workflow steps will be a very manual process for your team.

Take a Salesforce workflow step like creating contacts. There are dozens of fields (including obscure ones like the "emailBounced" fields) that would need to be implemented with names, descriptions, and whether or not they're required.

These 3rd-party workflow steps with proper names, descriptions, and required flags ensure that AI agents and users are able to populate these fields accordingly to automate work.

With Paragon:

The same Workflow and ActionKit products covered previously enable this workflow builder use case.

3rd-party Triggers

Paragon's Workflows provide a wide array of 3rd-party triggers that can in turn trigger your AI-enabled workflow feature or product. Paragon's platform not only reliably scales webhook infrastructure to accommodate multiple tenants across multiple integrations, but also provides out-of-the-box webhooks for 3rd-party providers that are difficult to integrate webhooks with. One example is Salesforce.

Salesforce's API makes it difficult to implement webhooks for multi-tenant applications (there are workarounds, such as using their Apex platform), so Paragon has built a webhook implementation for Salesforce that our users can easily use to trigger on common events like records being created or updated.

These 3rd-party triggers can subsequently be wrapped as triggers in your own workflow product.

3rd-party Actions

Providing 3rd-party actions for your AI-enabled workflow product is not dissimilar to providing 3rd-party API tools for your agent. For example, one AI step in a workflow could have an AI agent decide on tools to create tasks in Jira or Asana. Another option is to have 3rd-party actions be deterministic workflow steps (as we can see in the Copy.ai example).

Paragon’s ActionKit is a great fit for providing these 3rd-party action workflow steps as our API provides rich metadata - the parameter names, descriptions, and required fields - about each action. Our ActionKit GET request returns all of the fields you’ll need for any 3rd-party action, so you can list them programmatically in your workflow product. We even surface your users’ custom fields, important for many CRM integrations.

Calling the action is a simple POST request to the ActionKit endpoint. With that, your team can scale the number of 3rd-party action steps extremely quickly without spending nearly as much time researching each 3rd-party API.

Want more details on how this is implemented? Check out our tutorial/sample app.

Developer Experience & Production Considerations

There can often be tradeoffs in developer experience when it comes to using an external platform. Integration platforms are purposefully built for developers, but not all are built with production considerations like monitoring, version control, and maintenance for integration development.

Monitoring

Building integrations in-house, you’ll most likely use your existing logging and traceability system to monitor authentication to the integration, 3rd-party API calls, webhooks, and other integration related logic.

Integration platforms like Paragon also integrate with your existing logging and observability platforms, with the ability to stream events directly to your organization's Datadog, New Relic, Sentry, etc.

Paragon also supports monitoring directly in the product dashboard via Task History. Task History shows full workflow executions/traces, as well as actions executed through the ActionKit API. For each workflow action, you can see exactly what the inputs and outputs are for each step, the user/tenant that triggered the task, and other useful fields your team can filter on.

Version Control and Releases

Similar to building in-house, Paragon's integration platform follows the CI/CD (continuous integration, continuous deployment) best practices your engineering team is used to. While ActionKit API calls live in your code and are therefore version controlled with your codebase, workflows can also be represented as code via Paragraph.

Paragraph defines workflows as TypeScript so that your workflows are always version controlled. Any workflow built in Paragon can be pulled down as Paragraph code, and Paragraph-defined workflows can be pushed up to Paragon's workflow builder.

export default class extends Workflow {
  define(
    integration: ISalesforceIntegration,
    context: IContext<InputResultMap>,
    connectUser: IConnectUser<IPersona<typeof personaMeta>>,
  ) {
    // Define steps used in workflow
    const triggerStep = integration.triggers.recordCreated({
      recordType: 'Contact',
    });

    const searchByEmailStep = new RequestStep({
      url: `https://api.myapp.io/api/contacts?email=${triggerStep.output.contact.email}`,
      method: 'GET',
    });
  }
}

In addition to version control, Paragon also provides multiple environments for development, testing, and production so your team can safely and confidently push integration changes.

Maintenance

Monitoring, version control, and release environments are most likely systems your team already has in place for your AI product; they are not integration specific. Maintaining integrations, however, can be a real burden when building 3rd-party API features in-house.

In the use case sections above, we mentioned that 3rd-party API calls and webhooks all have different behaviors and implementations. These 3rd parties can change their APIs and webhooks, causing breaking changes in your AI application, which means your team has to constantly stay on top of updates from 3rd-party providers.

Buying a platform like Paragon means that the burden of responsibility for maintaining integrations falls on our team. Our workflows and ActionKit API endpoints will stay the same even as the underlying 3rd-party API changes.

Summary: In-House vs Integration Platform

Deciding whether to build integrations in-house or buy a platform like Paragon isn't straightforward; it depends on how many integrations you need to support, which of the use cases above you're building for, and how much engineering time you're willing to invest in auth, ingestion infrastructure, and ongoing maintenance.

If you're interested in taking a look at Paragon's solutions for your AI SaaS product, get in touch with our team to book a demo.