Vector Tutorial: Conducting Similarity Search in Enterprise Data

Software engineers occupy an exciting place in this world. Regardless of the tech stack or industry, we are tasked with solving problems that directly contribute to the goals and objectives of our employers. As a bonus, we get to use technology to tackle whatever challenges come our way.

For this example, I wanted to focus on how pgvector, an open-source vector similarity search extension for Postgres, can be used to identify similarities that exist in enterprise data.

A Simple Use Case

As a simple example, let’s assume the marketing department needs help with a campaign they plan to launch. The goal is to reach out to all Salesforce accounts in industries that closely align with the software industry.

In the end, they would like to focus on accounts in the top three most similar industries, with the ability to use this tool in the future to find similarities for other industries. If possible, they would like the option to provide the desired number of matching industries, rather than always returning the top three.
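To make the requirement concrete, here is a minimal Python sketch of a "top N most similar industries" lookup with a configurable limit that defaults to three. It is only an illustration of the idea, not the pgvector-based approach this tutorial works toward: the industry vectors are made-up toy values, the function names (cosine_similarity, most_similar) are hypothetical, and cosine similarity is assumed as the measure.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(target, vectors, limit=3):
    """Return the names of the `limit` industries most similar to `target`.

    `vectors` maps an industry name to its vector. The target itself is
    excluded from the results.
    """
    target_vector = vectors[target]
    scores = [
        (name, cosine_similarity(target_vector, vector))
        for name, vector in vectors.items()
        if name != target
    ]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:limit]]


# ASSUMPTION: toy three-dimensional vectors invented for illustration only.
# Real vectors would come from an embedding step and have many more dimensions.
toy_vectors = {
    "Software": [0.9, 0.8, 0.1],
    "Technology": [0.85, 0.75, 0.2],
    "Telecommunications": [0.7, 0.6, 0.3],
    "Banking": [0.4, 0.3, 0.5],
    "Agriculture": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    # Default behavior: the top three matches for the software industry.
    print(most_similar("Software", toy_vectors))
    # Marketing can also ask for a different number of matches.
    print(most_similar("Software", toy_vectors, limit=2))
```

Passing the limit as a parameter keeps the default at three while letting marketing request a different number of matches, or a different target industry, without any code changes.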

High-Level Design

This use case centers around performing a similarity search. While it is possible to complete this exercise manually, the

