Is your feature request related to a problem?
The topic relevance LLM validator incorrectly scores user queries in pure Hindi (Devanagari script) due to mixed scripts in the message. This leads to inconsistent scoring, causing confusion in the model's assessments.
Describe the solution you'd like
- Move scoring instructions to the system prompt
- Keep user messages as raw queries only
Original issue
Describe the bug
A clear and concise description of what the bug is.
The topic relevance LLM validator returns incorrect scores for user queries written in pure Hindi (Devanagari script). The same query written in romanized Hindi (Latin script) scores correctly.
The root cause is that scoring instructions were appended to the user message rather than placed in the system prompt. This caused the user message to be a mix of Devanagari script (the query) followed by English scoring rules — a script boundary within the same message that confuses the model. For romanized Hindi there is no script boundary, so it works correctly. The fix moves scoring instructions into the system prompt, keeping the user message as the raw query only.
To Reproduce
Steps to reproduce the behavior:
- Go to '...'
- Click on '....'
- Scroll down to '....'
- See error
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Additional context
Add any other context about the problem here.
Is your feature request related to a problem?
The topic relevance LLM validator incorrectly scores user queries in pure Hindi (Devanagari script) due to mixed scripts in the message. This leads to inconsistent scoring, causing confusion in the model's assessments.
Describe the solution you'd like
Original issue
Describe the bug
A clear and concise description of what the bug is.
The topic relevance LLM validator returns incorrect scores for user queries written in pure Hindi (Devanagari script). The same query written in romanized Hindi (Latin script) scores correctly.
The root cause is that scoring instructions were appended to the user message rather than placed in the system prompt. This caused the user message to be a mix of Devanagari script (the query) followed by English scoring rules — a script boundary within the same message that confuses the model. For romanized Hindi there is no script boundary, so it works correctly. The fix moves scoring instructions into the system prompt, keeping the user message as the raw query only.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Additional context
Add any other context about the problem here.