-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pgvector port #182
base: master
Are you sure you want to change the base?
Pgvector port #182
Conversation
.env.example
Outdated
PINECONE_ENVIRONMENT=environment | ||
PINECONE_KEY=key | ||
|
||
# POSTGRE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing s
requirements.txt
Outdated
replicate==0.18.1 | ||
psycopg[binary] | ||
psycopg2-binary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need new line
.env.example
Outdated
PINECONE_KEY=key | ||
|
||
# POSTGRE | ||
POSTGRE_HOST=localhost |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be POSTGRES
get_postgre_cursor().execute('SELECT * FROM embedding ORDER BY values <-> %s LIMIT 50', (np.array(query_vector),)) | ||
query_matches = get_postgre_cursor().fetchall() | ||
|
||
# query( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove any dead code
@@ -104,7 +112,7 @@ def smart_query(namespace, query, username: str): | |||
) | |||
used_messages = list( | |||
filter( | |||
lambda match: match["id"] | |||
lambda match: match[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move magic number to constant
requirements.txt
Outdated
replicate==0.18.1 | ||
psycopg[binary] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix these versions
@@ -4,6 +4,7 @@ | |||
from flask import Flask, request, jsonify | |||
from .config import get_google_tasks_service_account | |||
from .external_services.pinecone import get_pinecone_index | |||
from .external_services.postgre_vector import get_postgre_cursor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
postgre-> postgres
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a way to switch between the existing implementation with Pinecone and the new one with Postgres. Some level of abstraction is needed for that.
README.md
Outdated
*** | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is that space for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me remove one line. is it ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need these 3 new lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is my mistake, let me remove them.
register_vector(cur) | ||
|
||
# cur.execute('DROP TABLE IF EXISTS embedding') | ||
cur.execute('CREATE TABLE IF NOT EXISTS embedding (id bigserial PRIMARY KEY, namespace text, chunk_id text, metadata text, values vector)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we really do that during every connection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you know app will not create table if there is already exist. Let me find best way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I know that.
|
||
conn = None | ||
try: | ||
conn = psycopg2.connect( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should create a connection only when it's needed, not when Python modules are loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
except: | ||
logging.exception("Couldn't insert embeddings") | ||
|
||
|
||
def delete_pinecone_embedding(embeddings: list[Embedding], pinecone_index, pinecone_namespace): | ||
def delete_db_embedding(embeddings: List[Embedding], postgres_cursor, namespace): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we make it abstract?
from .external_services.openai import create_embeddings, summarize_thread_with_chat_gpt_3_5 | ||
import datetime | ||
from .external_services.slack_api import fetch_thread_messages, fetch_channel_messages, is_thread, \ | ||
is_actual_message, \ | ||
slack_names_map, filter_messages, load_previous_messages, load_subsequent_messages | ||
|
||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not used in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me remove
No description provided.