[ISSUE] Mixtral 8x7B is not responding with "right response" #646

Open
elegos opened this issue Sep 17, 2024 · 0 comments

Describe your issue

I'm trying to use Devika with the WukongV2-Mixtral-8x7B-V0.1-i1-GGUF model running locally, with the following settings:

  • Context length: 10240
  • GPU Offload: 6 / 32
  • Evaluation Batch Size: 512
  • System prompt: tried empty, and tried enforcing the format with "Your answers must always start with ~~~ and end with ~~~"; no apparent change either way
  • CPU threads: 12
  • Temperature: 0.4
  • Prompt template:
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
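
For reference, here is a minimal plain-Python sketch of what the prompt template above produces, so the exact text sent to the model can be inspected. The function name `render_chatml` is hypothetical and not part of Devika or LM Studio; it only mirrors the Jinja logic shown in the template.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Mirror the Jinja prompt template: one ChatML block per message."""
    parts = []
    for message in messages:
        # '<|im_start|>' + role + newline + content + '<|im_end|>' + newline
        parts.append(
            "<|im_start|>" + message["role"] + "\n"
            + message["content"] + "<|im_end|>" + "\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "Your answers must always start with ~~~ and end with ~~~"},
        {"role": "user", "content": "Write hello world in Python."},
    ]
    print(render_chatml(demo, add_generation_prompt=True))
```

Note that the template defaults `add_generation_prompt` to false, so the assistant turn is only opened when the caller sets it.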

The problem is that Devika cannot parse the model's response, probably because the response does not start and end with tildes (and I think the file name is declared in a different format, too). Is there any way to enforce the response format, or could you suggest a different GGUF Mixtral-8x7B imatrix model that works with Devika?

Thank you very much

How To Reproduce

Steps to reproduce the behavior (example):

  1. Run the model locally with the settings above
  2. Enter the following prompt (probably any prompt will do)

Write an application in Python which exposes a RESTful interface written in FastAPI, using pipenv as virtual environment manager. The application will have a register form to let users create their own accounts (user and email are unique, other fields are display name and password). The application will support OAuth2 login and JWT as token to authenticate the logged user. Source files should have separated folders depending on their component type, like controllers/ models/ services/ etc.

Expected behavior

Devika should accept the model's answers. Instead it rejects them, apparently because the model does not add the starting and ending ~~~.

Screenshots and logs

(escaping backslashes added before the code fences)

24.09.17 11:13:17: root: DEBUG  : Response from the model: \```python
# File: `main.py`
import sqlalchemy as sa
from fastapi import FastAPI, HTTPException
from models import User

app = FastAPI()

# Step 3 and 4
db_engine = sa.create_engine('sqlite:///users.db')
Base = sa.ext.declarative.base
Base.metadata.create_all(db_engine)
SessionLocal = sa.orm.sessionmaker(autocommit=False, autoflush=False, bind=db_engine)

# Step 5
@app.post("/users/register")
async def register_user(username: str, email: str, password: str):
    user = User(username=username, email=email, hashed_password=hash_password(password))
    session = SessionLocal()
    try:
        session.add(user)
        session.commit()
        return {"message": "User registered successfully"}
    except sa.exc.IntegrityError as e:
        raise HTTPException(status_code=409, detail="Username or email already exists")
\```

\```python
# File: `models/user.py`
from sqlalchemy import Boolean, Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True, index=True)
    username = Column(String, unique=True, index=True)
    email = Column(String, unique=True, index=True)
    hashed_password = Column(String)
\```
Invalid response from the model, I'm trying again...
24.09.17 11:13:17: root: INFO   : SOCKET info MESSAGE: {'type': 'warning', 'message': 'Invalid response from the model, trying again...'}
24.09.17 11:13:19: root: INFO   : SOCKET tokens MESSAGE: {'token_usage': 7837}
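
A possible workaround, sketched under assumptions: post-process the raw model response (without the escaping backslashes added to the log above) before Devika validates it. The helper names below are hypothetical, and the target layout (a `File:` line followed by a fenced block, everything wrapped in ~~~) is only a guess at what Devika accepts, not something confirmed by its code.

```python
import re

FENCE = "`" * 3  # triple backtick, built at runtime so it can be embedded safely

# Matches blocks shaped like the logged response:
#   <fence>python
#   # File: `name`
#   ...code...
#   <fence>
BLOCK_RE = re.compile(
    FENCE + r"[\w+-]*\n#\s*File:\s*`([^`]+)`\n(.*?)\n" + FENCE,
    re.DOTALL,
)


def is_tilde_wrapped(response):
    """True if the response starts and ends with ~~~ (the format the system prompt asks for)."""
    text = response.strip()
    return len(text) >= 6 and text.startswith("~~~") and text.endswith("~~~")


def extract_files(response):
    """Return (file_name, code) pairs found in the backtick-fenced response."""
    return BLOCK_RE.findall(response)


def rewrap(response):
    """Rebuild the response in the ASSUMED tilde-wrapped layout, or None if no files are found."""
    files = extract_files(response)
    if not files:
        return None
    chunks = [f"File: `{name}`:\n{FENCE}\n{body}\n{FENCE}" for name, body in files]
    return "~~~\n" + "\n\n".join(chunks) + "\n~~~"
```

If `is_tilde_wrapped` is already true, the response can be passed through unchanged; otherwise `rewrap` can be tried, with a retry as the fallback when it returns None.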

Configuration

- OS: Linux Fedora
- Python version: 3.12.5
- Node version: 19.9.0
- bun version: 1.1.27
- search engine: DuckDuckGo
- Model: WukongV2-Mixtral-8x7B-V0.1-i1-GGUF running on LM Studio (with the Devika adapter added as of https://github.com/stitionai/devika/pull/389)

Additional context

Nothing in here
