Introduction: A New Paradigm of Human-AI Collaboration
In an era of rapidly advancing artificial intelligence (AI) technology, the concept of “Human-in-the-Loop (HITL)” has become increasingly important. This refers to a framework in which humans continuously participate in, supervise, and can intervene in the decision-making processes of AI systems and algorithms. Rather than pursuing complete automation, this approach integrates human expertise and judgment into critical decisions. Particularly in the pharmaceutical and medical device sectors, this approach is positioned not only as a best practice but also as a regulatory requirement due to the critical nature of life-affecting decisions.
Fundamental Concepts of Human-in-the-Loop
Definition and Key Characteristics
HITL is a system where AI provides suggestions and analyses while humans make final judgments and approvals. This is not merely a technical approach but rather a framework that enables humans and AI to work collaboratively.
Key characteristics include the following. First, humans retain final decision-making authority: in critical decisions, the ultimate call rests with a person. Second, humans can verify AI outputs: they review AI-generated results and correct them as necessary. Third, there is continuous learning and feedback: the AI learns from human judgments, improving the system as a whole. Fourth, transparency and explainability are ensured: decision-making processes remain comprehensible, enabling accountability.
Implementation Levels of HITL
HITL exists at three levels depending on the degree of human involvement.
Human-in-Command (Human-Led): AI remains in a supporting role while humans maintain complete control. This applies to diagnostic support AI and literature search assistance. A typical example is when physicians make diagnoses based on their own judgment while referring to AI suggestions.
Human-on-the-Loop (Human Supervision): AI executes primary processing while humans monitor and intervene. This applies to manufacturing process monitoring and anomaly detection systems. During normal operations, AI performs automatic processing, but humans intervene during anomalies or critical decisions.
Human-in-the-Loop (Human Involvement): AI and humans truly collaborate, with human involvement essential for critical decisions. This applies to clinical trial data analysis and regulatory judgments, combining AI’s analytical capabilities with human expertise to derive optimal results.
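To make the distinction concrete, here is a minimal Python sketch; all names and routing rules are illustrative assumptions, not drawn from any regulation. It models the three levels as an enum and shows how a system might decide when a human sign-off is required.

```python
from enum import Enum, auto


class HITLLevel(Enum):
    """The three involvement levels described above (names are illustrative)."""
    HUMAN_IN_COMMAND = auto()   # AI supports; the human makes every decision
    HUMAN_ON_THE_LOOP = auto()  # AI acts; the human monitors and can intervene
    HUMAN_IN_THE_LOOP = auto()  # AI and human decide critical cases together


def requires_human_decision(level: HITLLevel, is_critical: bool, is_anomaly: bool) -> bool:
    """Return True when a human must sign off before the action takes effect."""
    if level is HITLLevel.HUMAN_IN_COMMAND:
        return True  # the human decides everything; AI only recommends
    if level is HITLLevel.HUMAN_ON_THE_LOOP:
        return is_anomaly or is_critical  # AI runs; a human steps in on exceptions
    return is_critical  # HUMAN_IN_THE_LOOP: critical decisions are made jointly


# Example: a manufacturing monitor (human-on-the-loop) during normal operation...
assert not requires_human_decision(HITLLevel.HUMAN_ON_THE_LOOP, is_critical=False, is_anomaly=False)
# ...but an anomaly escalates to a human, matching the description above.
assert requires_human_decision(HITLLevel.HUMAN_ON_THE_LOOP, is_critical=False, is_anomaly=True)
```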
Regulatory Requirements in the Pharmaceutical and Medical Device Sectors
Global Regulatory Trends
In the pharmaceutical and medical device sectors, regulatory authorities in various countries emphasize HITL approaches.
FDA (U.S. Food and Drug Administration)
On January 6, 2025, the FDA published its draft guidance titled “Artificial Intelligence-Enabled Device Software Functions: Lifecycle Management and Marketing Submission Recommendations.” This comprehensive draft positions HITL as a critical element through its recommendations on human factors engineering evaluation. Additionally, in December 2024, the agency issued its final guidance on Predetermined Change Control Plans (PCCPs).
Particularly notable is Project Elsa, which officially launched on June 2, 2025. This represents a concrete example of HITL implementation in pharmacovigilance operations. In this system, AI performs literature screening and summarization, supporting adverse event report summaries and streamlining clinical protocol reviews. Critically, it operates within a high-security GovCloud environment while maintaining final human decision-making authority.
However, during early deployment, quality issues emerged, including output variability and version management challenges. Some users reported instances of AI “hallucination” (generating nonexistent studies or misrepresenting research), emphasizing the importance of human supervision. To address these concerns, citation requirements were implemented when using document libraries. These experiences demonstrate both the theoretical and practical necessity of HITL approaches. Continuous functional improvements are planned within 30-day cycles based on human feedback.
EU (European Union)
The AI Act (Regulation EU 2024/1689), which entered into force on August 1, 2024, mandates human supervision for high-risk AI systems. Medical AI is automatically classified as high-risk under Article 6(1)(b), with medical devices classified as MDR Class IIa or higher automatically treated as high-risk AI systems.
Originally, compliance obligations for high-risk AI systems embedded in medical devices were scheduled to apply from August 2, 2027. However, with the Digital Omnibus package proposed in November 2025, the EU Commission introduced a mechanism linking implementation to the availability of harmonized standards, common specifications, and guidance. Under this revised framework, requirements for high-risk AI systems that are medical devices (Article 6(1) and Annex I) would apply 12 months after the Commission confirms the availability of compliance support measures, with a backstop deadline of August 2, 2028. This adjustment recognizes that the harmonized standards needed for compliance have not materialized on schedule and that conformity assessment infrastructures are not yet fully operational. Even under the revised timeline, however, the framework ensures that such systems cannot operate without appropriate human oversight.
Japan
The Ministry of Health, Labour and Welfare recommends HITL approaches for evaluating AI medical devices. Particularly for diagnostic support systems, approval processes have been established premised on final judgment by physicians.
Learning from FDA Project Elsa: Practice and Lessons in HITL
Concrete Achievements and Challenges of Project Elsa
FDA Project Elsa, which began operations on June 2, 2025, provides a valuable case study demonstrating the importance of HITL.
Concrete Achievements: Review times were reduced from days to hours in some cases, and pharmacovigilance operations became more efficient. By automating literature screening and summarization, experts can focus on higher-level judgments.
Challenges Identified During Initial Deployment: Output quality varied, and version management issues arose. These quality problems reconfirmed the importance of human supervision. As a countermeasure against hallucination (AI generating false information), mandatory citation requirements were implemented when using document libraries. According to CNN reporting in July 2025, FDA employees found instances where Elsa cited nonexistent studies or miscounted products with specific labels; when corrected, the AI would apologize but still fail to provide accurate answers. These experiences demonstrate that HITL approaches are essential: AI efficiency and human judgment must work in tandem. Continuous functional improvements in 30-day cycles continue based on human feedback.
Practical Applications in the Real World
1. AI Diagnostic Support Systems
In radiological image diagnosis, AI analyzes images, detecting and marking areas with potential abnormalities. However, it only presents diagnostic candidates; final diagnosis and treatment decisions are always made by physicians. This reduces the risk of oversight while improving diagnostic efficiency and accuracy.
2. Pharmacovigilance Systems
FDA Project Elsa provides a typical example. In adverse event detection, AI detects potential signals from big data and provides literature summaries. However, medical evaluation of those signals and response decisions are always made by pharmacovigilance specialists. Quality problems during initial deployment further proved the importance of this final human judgment.
3. Quality Control Systems
In pharmaceutical factory quality control, AI analyzes manufacturing data in real time to detect anomalies. Quality personnel perform root cause analysis on anomalies detected by AI and determine corrective actions. This enables early detection of quality problems and appropriate responses.
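As a rough illustration of this flag-and-review pattern, the following Python sketch shows how AI-side detection can feed a human review queue rather than trigger automatic corrective action. The class name, window size, and threshold are hypothetical, and a rolling z-score stands in for a real anomaly-detection model.

```python
import statistics
from collections import deque

# Hypothetical sketch: all names and thresholds are illustrative, and a
# rolling z-score stands in for a real anomaly-detection model.

class QualityMonitor:
    def __init__(self, window: int = 50, z_threshold: float = 3.0):
        self.readings = deque(maxlen=window)  # recent in-spec history
        self.z_threshold = z_threshold
        self.human_review_queue = []          # anomalies await a specialist

    def ingest(self, value: float) -> None:
        """Score a new reading; flag it for human review if it looks anomalous."""
        if len(self.readings) >= 10:
            mean = statistics.fmean(self.readings)
            spread = statistics.stdev(self.readings) or 1e-9
            z = abs(value - mean) / spread
            if z > self.z_threshold:
                # The AI only flags; root-cause analysis and corrective
                # action remain with the human quality specialist.
                self.human_review_queue.append({"value": value, "z": round(z, 2)})
        self.readings.append(value)

monitor = QualityMonitor()
for reading in [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 10.1, 14.5]:
    monitor.ingest(reading)
print(monitor.human_review_queue)  # the 14.5 reading awaits human disposition
```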
4. Clinical Trial Data Analysis
In analyzing large-scale clinical trial data, AI extracts statistical patterns and detects potential efficacy and safety signals. However, evaluation of clinical significance and regulatory judgments are jointly performed by medical experts and statistical specialists.
Critical Implementation Points
1. Ensuring Transparency and Countering Hallucination
Making AI decision-making rationale explainable in forms humans can understand is essential. In FDA Project Elsa, citation requirements were made mandatory when using document libraries to prevent hallucination. By avoiding black-box processes and ensuring decision-making transparency, humans can make appropriate intervention judgments.
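The citation requirement can be pictured as a release gate. The following Python sketch is a hypothetical illustration, not Elsa's actual mechanism: the `[DOC-123]` citation format and all names are invented. The idea is simply that a summary is held for human review unless every cited source resolves against the document library.

```python
import re

# Hypothetical citation gate: a summary is blocked from release unless every
# cited source actually exists in the document library. The citation format
# and function names are illustrative, not drawn from Elsa.

CITATION_PATTERN = re.compile(r"\[(DOC-\d+)\]")

def gate_summary(summary: str, library_ids: set) -> tuple:
    """Return (releasable, problems). Uncited or unresolvable summaries fail."""
    cited = CITATION_PATTERN.findall(summary)
    problems = []
    if not cited:
        problems.append("no citations found; summary cannot be verified")
    problems += [f"unknown source: {doc_id}" for doc_id in cited
                 if doc_id not in library_ids]
    return (not problems, problems)

library = {"DOC-101", "DOC-205"}
ok, issues = gate_summary("Adverse events rose in Q2 [DOC-101].", library)
assert ok and not issues
ok, issues = gate_summary("A 2024 study showed X [DOC-999].", library)  # hallucinated source
assert not ok  # blocked pending human review
```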
2. Human Intervention at Appropriate Timing
Setting intervention points according to risk level is important: low-risk judgments are delegated to AI, while human involvement is mandatory for high-risk decisions. Initial quality problems with FDA Project Elsa highlighted the importance of these judgment criteria. Mechanisms enabling immediate human decisions in emergencies, together with phased escalation processes, are also necessary.
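A minimal Python sketch of such risk-tiered routing follows; the tier names, examples, and escalation rules are illustrative assumptions rather than regulatory prescriptions.

```python
from enum import Enum

# Hypothetical sketch of risk-tiered routing: low-risk outputs pass through,
# higher tiers escalate in stages, and emergencies go straight to a human.
# Tier names and rules are illustrative, not drawn from any regulation.

class Risk(Enum):
    LOW = 1        # e.g. formatting a report section
    MEDIUM = 2     # e.g. literature triage
    HIGH = 3       # e.g. a safety-signal assessment
    EMERGENCY = 4  # e.g. suspected patient harm in progress

def route(risk: Risk) -> str:
    if risk is Risk.EMERGENCY:
        return "page on-call specialist immediately"    # immediate human decision
    if risk is Risk.HIGH:
        return "hold output until specialist sign-off"  # mandatory human approval
    if risk is Risk.MEDIUM:
        return "release with post-hoc human audit"      # sampled human review
    return "release automatically"                      # delegated to AI

assert route(Risk.LOW) == "release automatically"
assert route(Risk.HIGH) == "hold output until specialist sign-off"
```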
3. Maintaining Human Capabilities and Continuous Improvement
Continuous education and training are essential to prevent skill degradation from over-reliance on AI. Project Elsa plans functional additions in 30-day cycles, incorporating continuous improvement based on human feedback. Acquiring new skills for working with AI must be balanced with maintaining and deepening professional knowledge.
Benefits and Challenges
Benefits of HITL
Enhanced Safety: Final human checks prevent serious consequences from AI misjudgments. Initial quality problems with Project Elsa demonstrated the importance of this safety mechanism.
Clarity of Accountability: With humans bearing final responsibility, the locus of legal and ethical responsibility is clear. This is essential from medical malpractice and product liability perspectives.
Regulatory Compliance: Alignment with regulatory requirements from the FDA, EU, and Japan enables smoother approvals and market introduction. This is also important in preparing for the EU AI Act's August 2028 backstop deadline.
Continuous Improvement: AI performance improves through human feedback, enhancing overall system quality. Project Elsa’s continuous improvement process demonstrates this.
Challenges Faced
Balance with Efficiency: Human intervention increases processing time, yet Project Elsa still cut review times from days to hours, showing that an appropriate balance can be struck.
Quality Control Complexity: Quality problems occurring during Project Elsa’s initial deployment demonstrate the complexity of AI system version management and quality assurance.
Increased Costs: Continuously recruiting and training personnel with specialized knowledge is costly. In medical fields especially, the high level of specialization required tends to drive up personnel costs.
Human Error Risk: Excessive dependence on AI can reduce attentiveness, and fatigue can lead to judgment errors. Appropriate workload management is necessary.
Future Prospects
Technological Development
Progress in explainable AI enables more effective HITL implementation. Technological innovation continues, including risk-based approaches for setting optimal intervention points and integration with digital twins to improve prediction accuracy.
Evolution of Regulatory Environment
Standardization of HITL requirements is progressing, driven by the EU AI Act's implementation timeline (with its August 2028 backstop deadline) and continuous FDA guidance updates. International harmonization of guidelines is being pursued, and the establishment of validation methods is expected to enable more systematic quality assurance of HITL systems.
Adaptive Intervention Systems
Systems are being developed that dynamically adjust human intervention levels according to AI confidence levels. By automatically processing cases with high confidence and requesting human judgment only for uncertain cases, optimal balance between efficiency and reliability can be achieved.
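In skeleton form, such confidence-gated dispatch might look like the following Python sketch; the threshold values are placeholders that would need calibration against validated performance data.

```python
# Hypothetical sketch of confidence-gated routing: high-confidence cases are
# processed automatically, uncertain ones are deferred to a human reviewer.
# Both thresholds are illustrative and would require empirical calibration.

AUTO_THRESHOLD = 0.95    # above this, process automatically
REJECT_THRESHOLD = 0.50  # below this, discard the AI result outright

def dispatch(prediction: str, confidence: float) -> str:
    if confidence >= AUTO_THRESHOLD:
        return f"auto-accept: {prediction}"
    if confidence >= REJECT_THRESHOLD:
        return f"defer to human reviewer: {prediction} (confidence {confidence:.2f})"
    return "discard and route case for full human processing"

print(dispatch("no anomaly", 0.98))       # auto-accept
print(dispatch("possible signal", 0.72))  # human review
```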
Conclusion: Proven Value of HITL and Path to the Future
Human-in-the-Loop is a critical approach for building safe and reliable systems while balancing AI technological progress with regulatory requirements. The example of FDA Project Elsa clearly demonstrates not only HITL’s theoretical importance but also its practical necessity. Quality problems during initial deployment proved the limits of complete automation and the indispensability of human supervision.
Particularly in the pharmaceutical and medical device sectors, this approach is becoming increasingly important due to the critical nature of life-affecting decisions. The regulatory environment is also increasingly premised on HITL approaches, with the EU AI Act's August 2028 backstop deadline, continuous FDA guidance development, and similar frameworks being established globally.
Rather than pursuing complete automation, a future where humans and AI collaborate by leveraging their respective strengths leads to safer and more trustworthy healthcare realization. While technological progress continues unabated, human expertise, ethical values, and sense of responsibility represent irreplaceable value no matter how advanced AI becomes.
HITL is not merely a technical approach but also a philosophical guideline for utilizing AI’s benefits to the fullest while preserving human dignity and professionalism. As Project Elsa’s experience demonstrates, we now stand at the entrance to a new era of healthcare where AI and humans truly collaborate. The key to that success is precisely the concept of Human-in-the-Loop.
Table: Comparison of HITL Implementation Levels
| Level | AI Role | Human Role | Typical Applications | Decision Authority |
|---|---|---|---|---|
| Human-in-Command | Support/Recommendation | Primary decision-maker | Diagnostic support AI, Literature search | Complete human control |
| Human-on-the-Loop | Primary processing | Monitoring/Intervention | Manufacturing monitoring, Anomaly detection | AI automation with human oversight |
| Human-in-the-Loop | Co-processing | Essential participant in critical decisions | Clinical trial analysis, Regulatory decisions | Collaborative decision-making |
Table: Key Regulatory Milestones
| Date | Regulatory Authority | Event | Impact on HITL |
|---|---|---|---|
| August 2, 2024 | EU | AI Act entered into force | Mandated human oversight for high-risk AI |
| December 2024 | FDA | Final guidance on PCCPs published | Established framework for AI device updates |
| January 6, 2025 | FDA | Draft guidance on AI-DSF published | Emphasized human factors engineering evaluation |
| June 2, 2025 | FDA | Project Elsa launched | Demonstrated practical HITL implementation |
| August 2, 2027 (originally) | EU | AI Act high-risk requirements for medical devices | Original compliance deadline for medical-device AI |
| August 2, 2028 (backstop) | EU | AI Act deadline extended per Digital Omnibus | Extended compliance timeline recognizing implementation challenges |
Table: Project Elsa Timeline and Key Learnings
| Phase | Timeline | Key Features | Lessons Learned |
|---|---|---|---|
| Pilot | Before June 2025 | Testing with scientific reviewers | Established basic functionality requirements |
| Launch | June 2, 2025 | Agency-wide deployment in GovCloud | Achieved ahead of schedule, under budget |
| Early Issues | June-July 2025 | Quality problems identified | Hallucination, version management challenges |
| Countermeasures | July 2025 onwards | Citation requirements implemented | Human oversight proved essential |
| Continuous Improvement | Ongoing (30-day cycles) | Feature additions based on feedback | Demonstrated value of human-AI collaboration |