HR AI Assistant: Drove activation from 10% → 90% by making AI trustworthy & trainable

Context

After leading the company pivot and proving the concept from Mar'23 to Jun'24 (Case study: Led 0→1 GenAI people analytics to enterprise-ready), I shifted focus to increasing activation to ensure long-term user retention.

Timeline

Jul'24 - Feb'25

My role

• Sole product designer; I led product design end to end
• To identify pain points and validate solutions, I conducted continuous user research, attending 2–5 onboarding meetings and speaking with 10+ users each week

Team

1 product designer (me)
1 backend/model engineer
1 frontend engineer
CTO, CEO

Overview

Problem: 90% of users dropped off after asking a few questions due to a lack of trust, a lack of control, and unrealistic expectations of AI

Users asked a few workforce-related questions but didn’t continue using the product. Through user research, we discovered that the biggest barriers were:
1. Users didn’t understand how AI generated answers.
2. Users thought AI was wrong when results didn’t match expectations.
3. Users couldn’t correct mistakes or refine AI interpretations.

Solutions: Built trust with AI transparency and shifted users’ mental model through AI training

I shifted users’ mental model by introducing AI training tools that let them give constructive feedback and define custom terms, while building trust through transparent, progressive explanations. This turned the AI from a “black box” into a trainable system that encouraged long-term engagement.

Outcome: Increased activation from 10% → 90%

Users regularly asked questions and returned within the first 30 days, which reduced early churn and increased adoption across HR teams.

Phase 1: Increasing Activation from 10% → 30% with AI Transparency

Problem: Users didn’t trust AI calculations → They stopped using it

Users would ask a few workforce-related questions, but when they saw AI-generated numbers, they had no way to verify whether those numbers were correct. If a response seemed “off,” they lost confidence and didn’t return.

Initial attempt: adding more context to responses made them too long

To improve trust, I collaborated with the model engineer to add more contextual details directly in the AI response, such as time range, applied filters, and field mappings. However, this made responses long, and users still wanted even more detailed explanations around how numbers were calculated.

Solution: Progressive explanation exposure

Instead of cluttering responses, I introduced progressive disclosure to provide details only when needed. Not all responses raised doubts, and users often only wanted to verify specific answers, so making explanations on-demand reduced unnecessary cognitive load. Generating longer explanations also increased response time, so this approach balanced detail and efficiency. To support seamless implementation, I provided a detailed template for the model engineer, ensuring consistency in how explanations were structured and surfaced.
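
To make this concrete, here is a minimal sketch of how an on-demand explanation payload could be structured, assuming a simple answer-plus-details shape; the field names (time_range, applied_filters, field_mappings) are illustrative assumptions, not the exact template handed to the model engineer.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AnswerExplanation:
    """Details surfaced only when a user expands 'How was this calculated?'."""
    metric: str                      # e.g. "headcount"
    time_range: str                  # e.g. "2024-01-01 to 2024-06-30"
    applied_filters: dict[str, str]  # e.g. {"employment_type": "all"}
    field_mappings: dict[str, str]   # e.g. {"Manager": "job_title"}
    calculation_note: Optional[str] = None  # plain-language description of the formula

@dataclass
class AssistantResponse:
    answer: str                                      # concise answer shown immediately
    explanation: Optional[AnswerExplanation] = None  # generated and rendered only on demand
```

Keeping the explanation as a separate, optional object means the immediate answer stays short, while the detailed breakdown is produced only when a user asks for it.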

Impact

• Increased ongoing activation from 10% → 30%, as users gained confidence in AI calculations and continued asking questions.
• Reduced time spent explaining AI-generated answers by 80%, since users could self-verify results.

Phase 2: Increasing Activation from 30% → 60% with Precise Data Control

Problem: Users thought AI was “wrong” because default data didn’t match their expectations

• Users expected different numbers for workforce metrics like headcount or attrition, as their internal definitions varied.
• By default, the system included all employees (e.g., full-time, part-time, interns, contractors), while users were used to filtered versions of these metrics.
• This mismatch in expectations led users to believe AI-generated numbers were incorrect, even though the calculations were technically accurate.

Explorations: Finding the right approach for data control

To help users get accurate answers aligned with their internal definitions, I explored three approaches to managing data pools. After testing them, I prioritized user flexibility and real-time adjustments over rigid pre-set data pools, which led to the most effective solution.

Approach 1: Defining a default data pool during onboarding

Initially, I considered letting users define a default headcount pool during onboarding.

Why I ruled it out:
This approach lacked flexibility—users sometimes needed to include or exclude certain groups (e.g., interns, part-time employees) depending on the question.

Approach 2: Allowing users to define multiple data pools and select one per query

User feedback showed issues:
• Users easily forgot the definitions of their pre-set pools, making it hard to know which one to use.
• This added cognitive load and led to uncertainty rather than efficiency.

Final solution: Adding exclusion filters for direct data control

Instead of requiring pre-set data pools, I allowed users to dynamically exclude specific groups (e.g., interns, part-time employees) within the query interface.

Why this worked best:
• Users could see and adjust exclusions in real time, ensuring they understood exactly what data was included.
• It provided flexibility without relying on memory of predefined pools.
• It aligned with how users naturally thought about workforce metrics—starting with all employees, then filtering down as needed.
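
For illustration, here is a minimal sketch of how exclusion filters might narrow the employee pool at query time; the Employee shape and the employment_type values are assumptions made for the example, not the actual data model.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    id: str
    employment_type: str  # e.g. "full_time", "part_time", "intern", "contractor"

def headcount(employees: list[Employee], exclude_types: frozenset[str] = frozenset()) -> int:
    """Start from all employees, then drop any group the user excluded for this query."""
    return sum(1 for e in employees if e.employment_type not in exclude_types)

# Default includes everyone; an exclusion narrows the pool for a single query.
team = [Employee("1", "full_time"), Employee("2", "intern"), Employee("3", "contractor")]
print(headcount(team))                                       # 3
print(headcount(team, frozenset({"intern", "contractor"})))  # 1
```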

Impact

• Activation increased from 30% → 60%, as users could now align AI-generated answers with their internal definitions.
• Reduced query rework time by 20%, as users no longer had to define the pool in natural language for each query.

Phase 3: Increasing Activation from 60% → 90% with AI Training & Custom Terms

Problem: Users couldn’t correct AI mistakes or fix misinterpretations

• Even with transparency and filtering, users still encountered misinterpretations (e.g., “Manager” as a job function vs. job title).
• Users expected AI to be magic and didn’t realize they could train it to better reflect their internal definitions.
• While users engaged with the feedback tools (e.g., thumbs-down reactions), they provided very limited information, leading to slow AI improvements.

Explorations: AI-generated chips

I first explored AI-generated chips that would suggest likely interpretations and mappings alongside a response. After discussing this approach with the model engineer and weighing its pros and cons, I concluded that:
• Chips weren’t scalable – Every company had different definitions for job roles, making it impossible to provide universally relevant options.
• AI wasn’t smart enough to recommend the right mappings because each company had unique data structures and expectations.

Based on these findings, I determined that a user-driven approach was more effective than AI-suggested options.

Final solution: AI training & custom terms

1. Shifted users’ mental model through onboarding training
I worked with the CEO and CTO to change how AI was introduced—instead of marketing AI as magical and perfect, we framed it as trainable.
The onboarding flow taught users to refine AI responses, encouraging them to actively engage in training AI.

2. Designed structured feedback loops
Instead of a simple thumbs-down, I introduced a guided feedback system tailored to common user issues I identified from past use cases.

3. Enabled users to define and train custom terms
I designed a system that lets users define mappings in natural language, enabling AI to store and apply these definitions consistently across future queries.
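
For illustration, here is a minimal sketch of how the structured feedback options and a per-company glossary of custom terms could be represented; the category wording, field names, and the TermGlossary class are assumptions made for this example, not the production implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Structured feedback options offered in place of a bare thumbs-down
# (categories are illustrative, drawn from the common issues described above).
FEEDBACK_REASONS = [
    "Wrong group of employees included",
    "A term was mapped to the wrong field (e.g. job title vs. job function)",
    "Wrong time range",
    "Something else (describe in your own words)",
]

@dataclass
class CustomTerm:
    """A user-defined mapping, stored once and reused across future queries."""
    phrase: str         # what the user says, e.g. "managers"
    maps_to_field: str  # the data field it refers to, e.g. "job_function"
    maps_to_value: str  # the value it means, e.g. "People Management"

class TermGlossary:
    """Per-company glossary the assistant consults before interpreting a query."""

    def __init__(self) -> None:
        self._terms: dict[str, CustomTerm] = {}

    def define(self, term: CustomTerm) -> None:
        self._terms[term.phrase.lower()] = term

    def resolve(self, phrase: str) -> Optional[CustomTerm]:
        return self._terms.get(phrase.lower())

# Example: the user teaches the assistant what "managers" means at their company.
glossary = TermGlossary()
glossary.define(CustomTerm("managers", "job_function", "People Management"))
print(glossary.resolve("Managers"))
```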

Impact

• Activation increased from 60% → 90%, as users actively trained AI and continued using the product.
• Eliminated 40+ hours per week of manual engineering work by automating AI corrections—users could now clarify errors and mappings directly, without requiring intervention.

Learnings & Next Steps

Learnings

• Trust requires transparency and control – Users need progressive explanations to understand AI logic and precise data control to align results with their internal definitions.
• AI should be trainable, not assumed perfect – Initially, users expected AI to be perfectly accurate. However, engagement significantly increased once users understood the AI as an adaptive system they could actively refine and improve.
• User-driven customization is more effective than AI-driven suggestions – AI-generated mappings weren’t scalable due to company-specific differences, making custom terms and structured feedback loops the better solution.

Next steps

• Automate AI training based on user corrections – Reduce manual effort by having AI learn from past feedback and improve mapping suggestions over time.
• Introduce suggested follow-up questions – Help users discover relevant insights by recommending follow-up queries based on their previous questions.