Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pgvector port #182

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Pgvector port #182

wants to merge 10 commits into from

Conversation

tozzen0121
Copy link

No description provided.

.env.example Outdated
PINECONE_ENVIRONMENT=environment
PINECONE_KEY=key

# POSTGRE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing s

requirements.txt Outdated
replicate==0.18.1
psycopg[binary]
psycopg2-binary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need new line

.env.example Outdated
PINECONE_KEY=key

# POSTGRE
POSTGRE_HOST=localhost
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be POSTGRES

get_postgre_cursor().execute('SELECT * FROM embedding ORDER BY values <-> %s LIMIT 50', (np.array(query_vector),))
query_matches = get_postgre_cursor().fetchall()

# query(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove any dead code

@@ -104,7 +112,7 @@ def smart_query(namespace, query, username: str):
)
used_messages = list(
filter(
lambda match: match["id"]
lambda match: match[2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move magic number to constant

requirements.txt Outdated
replicate==0.18.1
psycopg[binary]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix these versions

@@ -4,6 +4,7 @@
from flask import Flask, request, jsonify
from .config import get_google_tasks_service_account
from .external_services.pinecone import get_pinecone_index
from .external_services.postgre_vector import get_postgre_cursor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

postgre-> postgres

Copy link
Contributor

@YuraLukashik YuraLukashik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a way to switch between the existing implementation with Pinecone and the new one with Postgres. Some level of abstraction is needed for that.

README.md Show resolved Hide resolved
README.md Outdated
Comment on lines 114 to 116
***


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is that space for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me remove one line. is it ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these 3 new lines?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this is my mistake, let me remove them.

requirements.txt Show resolved Hide resolved
register_vector(cur)

# cur.execute('DROP TABLE IF EXISTS embedding')
cur.execute('CREATE TABLE IF NOT EXISTS embedding (id bigserial PRIMARY KEY, namespace text, chunk_id text, metadata text, values vector)')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we really do that during every connection?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you know app will not create table if there is already exist. Let me find best way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know that.


conn = None
try:
conn = psycopg2.connect(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should create a connection only when it's needed, not when Python modules are loaded.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it

except:
logging.exception("Couldn't insert embeddings")


def delete_pinecone_embedding(embeddings: list[Embedding], pinecone_index, pinecone_namespace):
def delete_db_embedding(embeddings: List[Embedding], postgres_cursor, namespace):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we make it abstract?

from .external_services.openai import create_embeddings, summarize_thread_with_chat_gpt_3_5
import datetime
from .external_services.slack_api import fetch_thread_messages, fetch_channel_messages, is_thread, \
is_actual_message, \
slack_names_map, filter_messages, load_previous_messages, load_subsequent_messages

import numpy as np
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used in this file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me remove

src/semantic_search/semantic_search/query.py Outdated Show resolved Hide resolved
src/semantic_search/semantic_search/query.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants