Transparency: Health Trends

January 4, 2026

Health Trends is an opt-in feature that turns your EmoBay interactions into a private timeline of patterns: sentiment changes over time, emotional distribution, recurring themes, and an optional narrative reflection.

This feature is designed for personal insight, not diagnosis. It does not replace clinical care, and we intentionally present results as “signals” and “trends,” with clear caveats when data is limited.

1) Consent and eligibility

Health Trends requires explicit consent. It is available to registered users and is gated by authentication. Consent is stored server-side and checked before any report generation occurs.
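As an illustration of the consent check described above, here is a minimal sketch in Python. The `User` fields, the `ConsentRequired` exception, and the function name are hypothetical and are not taken from EmoBay's code; the sketch only shows the idea of refusing to start analysis until eligibility and stored consent have both been verified.

```python
from dataclasses import dataclass


class ConsentRequired(Exception):
    """Raised when a report is requested without stored consent."""


@dataclass
class User:
    user_id: str
    is_registered: bool
    health_trends_consent: bool  # hypothetical flag, stored server-side


def ensure_can_generate_report(user: User) -> None:
    # Check eligibility first, then consent, before any analysis starts.
    if not user.is_registered:
        raise PermissionError("Health Trends is only available to registered users")
    if not user.health_trends_consent:
        raise ConsentRequired("Health Trends consent has not been given")
```

A report job would call `ensure_can_generate_report(user)` as its first step, so a withdrawn consent stops new report generation immediately.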

You can disable Health Trends at any time. Disabling stops new report generation. You may also delete your saved report (charts + derived metrics + narrative). Account deletion is the strongest option and removes associated data.

2) What data is analyzed

Health Trends analyzes the same message history you create while using EmoBay. Reports are built from the user and assistant messages in your own conversations over a time window (often “all history” for the full report).

For transparency: parts of this pipeline use deterministic heuristics (like sentiment scoring), and parts use a language model to produce structured summaries for aggregation. Both are described below.

3) Architecture at a glance

High-level flow (server-side)

Gather chats for your account (optionally excluding calls in “auto” mode)
Summarize each conversation into structured JSON (LLM)
Store per-chat summaries for reuse (so future runs are faster)
Run timeline analyses over your user messages:
- Sentiment per day + 7-day moving average (VADER)
- Emotion counts per day (lexicon-based; currently English-first)
- Topics (TF-IDF over user text) + topic categories (from summaries)
Save a report snapshot (charts + metrics)
Optionally generate a warm narrative reflection (LLM) and save it into the report
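
The "store per-chat summaries for reuse" step in the flow above can be sketched as a simple cache lookup. This is an assumption-laden illustration: the function name, the dictionary cache, and the `summarize` callable are hypothetical stand-ins for whatever storage and LLM call the service actually uses.

```python
from typing import Callable, Dict, List


def summarize_with_cache(
    chat_ids: List[str],
    cache: Dict[str, dict],
    summarize: Callable[[str], dict],
) -> List[dict]:
    """Return one summary per chat, calling `summarize` only on cache misses."""
    summaries = []
    for chat_id in chat_ids:
        if chat_id not in cache:
            # Only chats without a stored summary reach the (slow) summarizer.
            cache[chat_id] = summarize(chat_id)
        summaries.append(cache[chat_id])
    return summaries
```

This sketch never refreshes a cached entry. A real system would also need a rule for chats that received new messages after they were summarized, which the text above does not specify.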

4) Conversation summarization (structured)

We summarize each conversation into a strict JSON structure to make aggregation predictable. The summarizer is constrained to output:

A neutral short summary (roughly a paragraph)
2–3 key points
A small set of representative quotes
1–3 topics chosen from a fixed category list (e.g., Work, Study, Relationship, Mental Health…)
A set of “wellbeing dimension” scores, each on a 0–1 scale (stress, anxiety, mood, energy, sleep, social)
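
To make the structure above concrete, here is a hedged sketch of a validator for one summary object. The field names (`summary`, `key_points`, `quotes`, `topics`, `scores`) are hypothetical, the category set contains only the four examples named in the list, and the sketch assumes each wellbeing score is a number between 0 and 1.

```python
from typing import Any, Dict, List

DIMENSIONS = ("stress", "anxiety", "mood", "energy", "sleep", "social")
# Only the categories named in the text; the full fixed list is not given there.
KNOWN_TOPICS = {"Work", "Study", "Relationship", "Mental Health"}


def validate_summary(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the summary is usable."""
    problems = []
    if not isinstance(data.get("summary"), str) or not data["summary"].strip():
        problems.append("summary must be a non-empty string")
    key_points = data.get("key_points")
    if not isinstance(key_points, list) or not 2 <= len(key_points) <= 3:
        problems.append("key_points must hold 2-3 items")
    quotes = data.get("quotes")
    if not isinstance(quotes, list) or not all(isinstance(q, str) for q in quotes):
        problems.append("quotes must be a list of strings")
    topics = data.get("topics")
    if not isinstance(topics, list) or not 1 <= len(topics) <= 3:
        problems.append("topics must hold 1-3 items")
    elif any(not isinstance(t, str) or t not in KNOWN_TOPICS for t in topics):
        problems.append("topics must come from the fixed category list")
    scores = data.get("scores")
    if not isinstance(scores, dict):
        problems.append("scores must be an object")
    else:
        for name in DIMENSIONS:
            value = scores.get(name)
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0 <= value <= 1:
                problems.append(f"scores.{name} must be a number between 0 and 1")
    return problems
```

Rejecting or retrying any model output that fails a check like this is one way to keep downstream aggregation predictable.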

To keep processing bounded, extremely long conversations are truncated to a fixed maximum message count by keeping the beginning and the end of the chat. System-only and empty content is excluded.
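
The truncation rule can be sketched as follows. The message shape (`role` and `content` keys), the even split between beginning and end, and the reading of "system-only" as messages with a system role are all assumptions made for illustration.

```python
from typing import Dict, List


def truncate_messages(messages: List[Dict[str, str]], max_messages: int) -> List[Dict[str, str]]:
    """Drop system and empty messages, then keep the start and end of long chats."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    usable = [
        m for m in messages
        if m.get("role") != "system" and (m.get("content") or "").strip()
    ]
    if len(usable) <= max_messages:
        return usable
    head = max_messages // 2          # messages kept from the beginning
    tail = max_messages - head        # messages kept from the end (always >= 1)
    return usable[:head] + usable[-tail:]
```

Keeping both ends preserves how a conversation opened and how it resolved, at the cost of dropping the middle.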

5) Timeline metrics (heuristics)

In addition to the per-chat structured summaries, the report includes “timeline-style” computations:

Sentiment timeline is computed from your user messages using VADER sentiment scoring and then aggregated per calendar day. We also compute a 7-point moving average over the daily values to reduce noise.
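
A minimal sketch of the daily aggregation and smoothing follows. It takes the scoring function as a parameter so the example stays self-contained; averaging the scores within a day and using a trailing window are assumptions, since the text only says the scores are "aggregated" and smoothed.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Callable, Iterable, List, Tuple


def daily_sentiment(
    messages: Iterable[Tuple[date, str]],
    score: Callable[[str], float],
) -> List[Tuple[date, float]]:
    """Average the score of all messages that fall on the same calendar day."""
    by_day = defaultdict(list)
    for day, text in messages:
        by_day[day].append(score(text))
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


def moving_average(values: List[float], window: int = 7) -> List[float]:
    """Trailing average; the first points use however many values exist so far."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed
```

With the `vaderSentiment` package installed, the scorer could be `lambda text: SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`, which yields a value between -1 and 1. Note that this window counts data points, so it matches a 7-day window only when every day in the range has at least one message.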

Emotional distribution is computed via a lexicon-based counter across eight basic emotions. This is intentionally simple and conservative, and today it is most accurate for English text. It is a trend signal, not a psychological assessment.
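
A lexicon-based counter of this kind can be sketched in a few lines. The text does not name the eight emotions, so this example assumes Plutchik's eight basic emotions, and the word list is a tiny hypothetical lexicon for demonstration only.

```python
import re
from collections import Counter
from typing import Dict

# Assumed set: Plutchik's eight basic emotions.
EMOTIONS = ("anger", "anticipation", "disgust", "fear", "joy", "sadness", "surprise", "trust")

# Tiny hypothetical English lexicon; a real one would hold thousands of entries.
LEXICON: Dict[str, str] = {
    "furious": "anger",
    "excited": "anticipation",
    "gross": "disgust",
    "scared": "fear",
    "happy": "joy",
    "sad": "sadness",
    "shocked": "surprise",
    "reliable": "trust",
}


def count_emotions(text: str, lexicon: Dict[str, str] = LEXICON) -> Counter:
    """Count lexicon hits per emotion; words outside the lexicon are ignored."""
    counts = Counter({emotion: 0 for emotion in EMOTIONS})
    for word in re.findall(r"[a-z']+", text.lower()):
        emotion = lexicon.get(word)
        if emotion is not None:
            counts[emotion] += 1
    return counts
```

Because matching is exact and word-by-word, negation ("not happy") and non-English text are not handled, which is one reason such counts are best read as a rough trend signal.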

Topics are extracted using TF-IDF (including bigrams) with an expanded stopword list and a “meaningful topic” heuristic to prefer domain-relevant terms (work, school, relationships, etc.).
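
The following is a small pure-Python sketch of TF-IDF over unigrams and bigrams with a stopword list. The stopword set, the smoothed IDF formula, and the summing of scores across documents are assumptions, and the "meaningful topic" heuristic is not reproduced because the text does not describe its rules.

```python
import math
import re
from collections import Counter
from typing import List, Set

# Small hypothetical stopword list; the expanded list is not given in the text.
STOPWORDS: Set[str] = {"a", "an", "and", "the", "i", "my", "is", "to", "of", "in", "it", "was"}


def terms(text: str, stopwords: Set[str] = STOPWORDS) -> List[str]:
    """Unigrams plus bigrams, built after stopwords are removed."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams


def top_terms(docs: List[str], k: int = 5, stopwords: Set[str] = STOPWORDS) -> List[str]:
    """Rank terms by TF-IDF summed over all documents."""
    doc_terms = [terms(d, stopwords) for d in docs]
    n_docs = len(doc_terms)
    doc_freq = Counter(t for ts in doc_terms for t in set(ts))
    scores: Counter = Counter()
    for ts in doc_terms:
        if not ts:
            continue
        for term, count in Counter(ts).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1  # smoothed IDF
            scores[term] += (count / len(ts)) * idf
    return [term for term, _ in scores.most_common(k)]
```

In practice a library such as scikit-learn's `TfidfVectorizer` with `ngram_range=(1, 2)` covers the same ground; the hand-rolled version is shown only to keep the mechanics visible.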

6) Long-running analysis and resumability

A full-history report can involve many conversations. To make this feasible in a serverless environment, analysis is chunked by date ranges and processed in batches with retry/backoff logic. If the job approaches a timeout, we store progress and allow you to run “Update Report” to continue later rather than failing silently.
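
One way to picture the chunked, resumable job is the sketch below. The chunk labels, the `progress` dictionary, the time budget, and the exponential backoff schedule are hypothetical; the sketch only illustrates retrying a failed chunk, recording completed chunks, and stopping cleanly before a timeout so a later run can continue.

```python
import time
from typing import Callable, Dict, List


def run_chunks(
    chunks: List[str],
    process: Callable[[str], None],
    progress: Dict[str, List[str]],
    deadline_seconds: float,
    max_retries: int = 3,
    base_delay: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True when every chunk is done, False when stopped early to avoid a timeout."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    started = clock()
    done = progress.setdefault("done", [])
    for chunk in chunks:
        if chunk in done:
            continue  # finished in an earlier run
        if clock() - started > deadline_seconds:
            return False  # progress is kept; a later run resumes from here
        for attempt in range(max_retries):
            try:
                process(chunk)
                break
            except Exception:
                if attempt == max_retries - 1:
                    raise  # out of retries: surface the error
                sleep(base_delay * 2 ** attempt)  # exponential backoff
        done.append(chunk)
    return True
```

If `progress` is persisted between invocations, a later call with the same chunk list skips everything already recorded, which is the behavior an "Update Report" action would rely on.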

When the report completes, we may send a push notification (best-effort) so you can return to view charts and insights.

7) Narrative insights (optional, generated)

The “Personal Insights” narrative is generated only when you request it. We build a compact payload from the report (metrics + top topics + a limited quote set + conversation summaries) and prompt a language model to write a short supportive reflection. The narrative is saved back into your private report so it can be displayed consistently across sessions.
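
As a rough sketch of the "compact payload" idea, the function below trims a report to a bounded set of fields before it is handed to a language model. The report keys and the size limits are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def build_narrative_payload(
    report: Dict[str, Any],
    max_topics: int = 5,
    max_quotes: int = 3,
    max_summaries: int = 10,
) -> Dict[str, Any]:
    """Keep metrics whole and cap the list-valued fields to fixed sizes."""
    return {
        "metrics": report.get("metrics", {}),
        "top_topics": report.get("topics", [])[:max_topics],
        "quotes": report.get("quotes", [])[:max_quotes],
        "summaries": [s["summary"] for s in report.get("summaries", [])[:max_summaries]],
    }
```

Capping each list keeps the prompt size predictable regardless of how much history the report covers.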

The narrative is designed to feel human and encouraging. It is not a diagnosis, and it should not be treated as medical advice.

8) Privacy, retention, and deletion

Health Trends runs server-side and requires authentication.

Deleting your report removes the saved report snapshot (charts/metrics/narrative). Per-conversation summaries may remain so you don’t need to re-summarize everything from scratch the next time you run analysis. If you want a full purge, deleting your account removes associated data at the database level.

9) Limitations

Health Trends is a signal, not a truth machine. Short texts, sarcasm, code-switching, and multilingual inputs can reduce accuracy. We surface low-data notes when you have limited interactions, and we are actively improving multilingual support and robustness.
