feat: Semantic splitter #63

Merged: 17 commits, Mar 2, 2024

simjak
Contributor

@simjak simjak commented Feb 25, 2024

  • Added a semantic splitter
  • Integrated the semantic splitter with Unstructured element grouping by title
  • Included fake title detection
  • Additional improvements (BaseChunk model, batch embeddings, batch uploads)
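
For readers unfamiliar with the idea, here is a minimal sketch of semantic splitting in general, not the implementation in this PR. It starts a new chunk whenever the similarity between neighbouring sentences drops below a threshold. The names `toy_embed`, `cosine`, and `semantic_split`, the bag-of-words "embedding", and the `0.2` threshold are all assumptions for illustration; a real setup would call an embedding model instead.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in encoder (assumption): bag-of-words counts instead of a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_split(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Group consecutive sentences; open a new chunk when similarity falls below threshold."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        # Compare against the last sentence of the current chunk.
        if chunks and cosine(toy_embed(chunks[-1][-1]), toy_embed(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks
```

With the toy encoder, two sentences that share most of their words stay in one chunk, while a sentence with no shared words opens a new one.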

See walkthrough.ipynb for usage.

@simjak simjak changed the title Simonas/semantic splitter feat: Semantic splitter Feb 25, 2024
@homanp
Contributor

homanp commented Feb 26, 2024

Amazing @simjak

@homanp homanp self-assigned this Feb 26, 2024
@homanp homanp added the enhancement New feature or request label Feb 26, 2024
@homanp
Contributor

homanp commented Feb 26, 2024

@simjak

Getting the following after running poetry install:

INFO:     Will watch for changes in these directories: ['/Users/ismailpelaseyed/Projects/super-rag']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [41325] using StatReload
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/uvicorn/_subprocess.py", line 78, in subprocess_started
    target(sockets=sockets)
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/uvicorn/server.py", line 62, in run
    return asyncio.run(self.serve(sockets=sockets))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/uvicorn/server.py", line 69, in serve
    config.load()
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/uvicorn/config.py", line 458, in load
    self.loaded_app = import_from_string(self.app)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/uvicorn/importer.py", line 24, in import_from_string
    raise exc from None
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/uvicorn/importer.py", line 21, in import_from_string
    module = importlib.import_module(module_str)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/Users/ismailpelaseyed/Projects/super-rag/main.py", line 6, in <module>
    from router import router
  File "/Users/ismailpelaseyed/Projects/super-rag/router.py", line 3, in <module>
    from api import delete, ingest, query
  File "/Users/ismailpelaseyed/Projects/super-rag/api/delete.py", line 4, in <module>
    from service.embedding import get_encoder
  File "/Users/ismailpelaseyed/Projects/super-rag/service/embedding.py", line 11, in <module>
    from semantic_router.encoders import (
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/semantic_router/__init__.py", line 1, in <module>
    from semantic_router.hybrid_layer import HybridRouteLayer
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/semantic_router/hybrid_layer.py", line 6, in <module>
    from semantic_router.encoders import (
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/semantic_router/encoders/__init__.py", line 8, in <module>
    from semantic_router.encoders.tfidf import TfidfEncoder
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/semantic_router/encoders/tfidf.py", line 10, in <module>
    from semantic_router.route import Route
  File "/Users/ismailpelaseyed/Projects/super-rag/.venv/lib/python3.11/site-packages/semantic_router/route.py", line 5, in <module>
    from PIL.Image import Image
ModuleNotFoundError: No module named 'PIL'
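
The traceback ends with semantic_router/route.py importing PIL, so the final error means that module is not importable in the virtualenv. PIL is the import name provided by the Pillow distribution. A small sketch for checking this before starting the server follows; the helper name `missing_modules` is hypothetical and not part of this repository.

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "PIL" is the import name; the package that provides it is Pillow.
    # An empty list means the import in semantic_router/route.py should resolve.
    print(missing_modules(["PIL"]))
```

Run it with the same interpreter that uvicorn uses (the one in .venv), otherwise the result may not reflect the environment that failed.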

@simjak
Contributor Author

simjak commented Feb 27, 2024

Solved this with semantic router version 0.0.25

@homanp
Contributor

homanp commented Feb 27, 2024

Solved this with semantic router version 0.0.25

Updated dependencies but still seeing the issue.

Contributor

@homanp homanp left a comment

I made some small fixes and changes, but it seems to work well, good job! These changes introduce some breaking changes to the encoder provider in the API, so I need to migrate to that first before merging.

Or should we make the new splitter config optional and just set a default? Perhaps you could add that in?
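
One way the optional-with-default idea could look is sketched below, under stated assumptions. The class and field names (`SplitterConfig`, `IngestRequest`, `name`, `max_tokens`, `index_name`) and the default values are hypothetical, and standard-library dataclasses are used only to keep the sketch dependency-free; the actual request models in this project may be defined differently.

```python
from dataclasses import dataclass, field


@dataclass
class SplitterConfig:
    # Hypothetical fields and defaults, for illustration only.
    name: str = "semantic"
    max_tokens: int = 500


@dataclass
class IngestRequest:
    index_name: str
    # Optional field: callers that omit it get a fresh default SplitterConfig,
    # so existing clients keep working without sending the new config.
    splitter: SplitterConfig = field(default_factory=SplitterConfig)
```

Existing callers can keep sending `IngestRequest(index_name="docs")`, while new callers can pass `splitter=SplitterConfig(max_tokens=200)` to override the default.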

simjak and others added 8 commits February 28, 2024 10:40
* Add support for querying code interpreter

* Fix formatting

* Ensure the sandbox close is called on exceptions

* Update service/code_interpreter.py

Co-authored-by: Tomas Valenta <[email protected]>

* Update service/code_interpreter.py

Co-authored-by: Tomas Valenta <[email protected]>

* Update service/router.py

Co-authored-by: Tomas Valenta <[email protected]>

* Update service/code_interpreter.py

Co-authored-by: Tomas Valenta <[email protected]>

* Add system prompt

* Format code

* Bump dependencies

* Minor tweaks

---------

Co-authored-by: Tomas Valenta <[email protected]>
@homanp homanp merged commit ae3b113 into superagent-ai:main Mar 2, 2024
3 checks passed