fix(medcat-trainer): Handle case where SOLR contains collections that aren't from trainer #536

Merged
alhendrickson merged 1 commit into main from fix/medcat-trainer/reused-solr-instance
Jun 11, 2026

@alhendrickson alhendrickson commented Jun 11, 2026

Collaborator

This fix should sort out the CDB import in k8s. Trainer currently fails if the SOLR instance is used for anything else, because the current implementation requires every collection to have a 'cui' field. In k8s, a SOLR collection "my-collection" gets initialised without a 'cui' field, so the lookup breaks.

Right now, saving a project hits this error:

[medcattrainer] INFO 2026-06-11 14:12:10,012 solr_utils.py l:46:Retrieving all SOLR collections: http://cogstack-solr:8983/solr/admin/collections?action=LIST
[medcattrainer] INFO 2026-06-11 14:12:10,024 solr_utils.py l:32:Retrieving solr schema: http://cogstack-solr:8983/solr/my-collection/schema
[medcattrainer] ERROR 2026-06-11 14:12:10,040 views.py l:862:Failed to search for concept_search_index. Solr Search Service not available
[medcattrainer] Traceback (most recent call last):
[medcattrainer]   File "/home/api/api/views.py", line 860, in concept_search_index_available
[medcattrainer]     return collections_available(cdb_ids)
[medcattrainer]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[medcattrainer]   File "/usr/local/lib/python3.12/contextlib.py", line 81, in inner
[medcattrainer]     return func(*args, **kwds)
[medcattrainer]            ^^^^^^^^^^^^^^^^^^^
[medcattrainer]   File "/home/api/api/solr_utils.py", line 52, in collections_available
[medcattrainer]     _cache_solr_collection_schema_types(col)
[medcattrainer]   File "/usr/local/lib/python3.12/contextlib.py", line 81, in inner
[medcattrainer]     return func(*args, **kwds)
[medcattrainer]            ^^^^^^^^^^^^^^^^^^^
[medcattrainer]   File "/home/api/api/solr_utils.py", line 35, in _cache_solr_collection_schema_types
[medcattrainer]     cui_type = [n for n in resp['schema']['fields'] if n['name'] == 'cui'][0]['type']
[medcattrainer]                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
[medcattrainer] IndexError: list index out of range
[medcattrainer] Internal Server Error: /api/concept-db-search-index-created/
[medcattrainer] ERROR 2026-06-11 14:12:10,042 log.py l:253:Internal Server Error: /api/concept-db-search-index-created/

The fix is to make it ignore collections that don't have the field 'cui' present.

I'll separately update the helm chart so it no longer creates that collection, but this is the real fix here.
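
A minimal sketch of the idea, not the actual patch: the traceback shows the `IndexError` coming from indexing `[0]` into an empty list when no field named 'cui' exists. The helper names `cui_field_type` and `trainer_collections`, the collection name `example_cdb_1`, and the `string` field type are all hypothetical, and the input is assumed to be the already-parsed JSON of each collection's `/schema` response.

```python
from typing import Any, Dict, Optional


def cui_field_type(schema_resp: Dict[str, Any]) -> Optional[str]:
    """Return the type of the 'cui' field, or None if the schema has no such field."""
    fields = schema_resp.get("schema", {}).get("fields", [])
    # next() with a default avoids the IndexError that `[...][0]` raises on an empty list.
    match = next((f for f in fields if f.get("name") == "cui"), None)
    return match.get("type") if match else None


def trainer_collections(schemas: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Map collection name -> 'cui' field type, skipping collections without 'cui'."""
    result: Dict[str, str] = {}
    for name, schema_resp in schemas.items():
        cui_type = cui_field_type(schema_resp)
        if cui_type is None:
            # Not a trainer collection (e.g. "my-collection"); ignore it.
            continue
        result[name] = cui_type
    return result


if __name__ == "__main__":
    # Hypothetical schema responses keyed by collection name.
    schemas = {
        "my-collection": {"schema": {"fields": [{"name": "id", "type": "string"}]}},
        "example_cdb_1": {"schema": {"fields": [{"name": "cui", "type": "string"}]}},
    }
    print(trainer_collections(schemas))
```

Returning `None` rather than raising keeps the "is this a trainer collection?" decision with the caller, so an unrelated collection on a shared SOLR instance is skipped instead of failing the whole request.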

@tomolopolis tomolopolis left a comment

Member

lgtm

@alhendrickson alhendrickson merged commit c34c097 into main Jun 11, 2026
10 checks passed
@alhendrickson alhendrickson deleted the fix/medcat-trainer/reused-solr-instance branch June 11, 2026 15:41
@alhendrickson

Collaborator Author

Helm chart also fixed in CogStack/cogstack-platform#97.

2 participants