LOF Makerspace

Benchmarking AI models for trusted medication data in healthcare

MedCompare is an AI-driven benchmarking system designed to evaluate the accuracy and completeness of medication information provided by several large language models (LLMs): Gemini, DeepSeek, and LLaMA 4. A fourth LLM, ChatGPT 4o, was used to independently evaluate and score the medication data produced by the other three. What sets this project apart is its emphasis on validating LLM-generated outputs against trusted sources of medical truth, so that any AI-enhanced decision-making in healthcare rests on reliable foundations.

Key Features

Semantic Similarity Scoring: Utilizes advanced algorithms to assess the alignment between LLM outputs and official drug information.
Fuzzy Matching: Detects minor discrepancies in drug names and codes to ensure data integrity.
Batch Processing: Evaluates multiple medications simultaneously for greater scalability.
User-Friendly Interface: Intuitive design for clinicians and researchers to explore results easily.
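
To make the first three features concrete, here is a minimal Python sketch of how fuzzy name matching, text similarity scoring, and batch evaluation could fit together. It is not MedCompare's actual implementation: the function names, the 0.85 name threshold, and the sample data are all hypothetical, and the bag-of-words cosine similarity is only a crude stand-in for whatever semantic scoring the project uses (an embedding model would be a more realistic choice).

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def fuzzy_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for embedding-based scoring)."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def score_batch(reference: dict, llm_output: dict, name_threshold: float = 0.85) -> dict:
    """Score every reference medication against one model's output.

    Both arguments map a drug name to a free-text description.
    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    results = {}
    for ref_name, ref_text in reference.items():
        # Pick the model's entry whose name is closest to the reference name.
        best = max(llm_output, key=lambda n: fuzzy_ratio(ref_name, n), default=None)
        if best is None or fuzzy_ratio(ref_name, best) < name_threshold:
            results[ref_name] = None  # the model has no usable entry for this drug
            continue
        results[ref_name] = {
            "matched_name": best,
            "name_score": fuzzy_ratio(ref_name, best),
            "text_score": cosine_similarity(ref_text, llm_output[best]),
        }
    return results


# Hypothetical sample data, for illustration only.
reference = {"Metformin": "Take with meals", "Lisinopril": "Take once daily"}
model_output = {"Metformine": "Take with meals"}
scores = score_batch(reference, model_output)
```

In this sample, the misspelled "Metformine" is still paired with "Metformin" because the name similarity clears the threshold, while "Lisinopril" is reported as missing (`None`). Running the same function once per model would give comparable score tables for each LLM.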

Get in touch

Want In?

Members receive access to the full Leap of Faith ecosystem: AI tools, implementation support, and specialized engineering resources. Let's talk about how we can accelerate your AI strategy.
