@@ -0,0 +1,138 @@
# Understanding the Stats Table in Glific

The Stats table is Glific's built-in analytics system. It automatically records activity data for your organisation across different time windows so you can track how your bot is performing over time.

---

## Time Periods

Every stat record belongs to one of five time periods. The same metrics are calculated for each period independently.

| Period | What it covers |
|---|---|
| **Hour** | A single hour of activity |
| **Day** | A full calendar day (midnight to midnight) |
| **Week** | A full calendar week (Monday to Sunday) |
| **Month** | A full calendar month |
| **Summary** | A special snapshot generated at the end of each week and month — it covers the entire week or month in one calculation and is the most reliable figure to use for reporting |

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how Summary period is used in the stats table documentation

rg -n -C3 'period.*summary|summary.*period' --type=md -g '!*Stats Table*'

Repository: glific/docs

Length of output: 885


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md"

echo "== Lines 1-40 =="
nl -ba "$FILE" | sed -n '1,40p'

echo
echo "== All occurrences of 'summary' in target file (with context) =="
rg -n -C2 'summary' "$FILE" --type=md

echo
echo "== Where 'Summary' row is defined (best-effort) =="
rg -n 'Summary' "$FILE" --type=md

Repository: glific/docs

Length of output: 133


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md"

echo "== Lines 1-60 =="
awk 'NR>=1 && NR<=60 {printf "%d:%s\n", NR, $0}' "$FILE"

echo
echo "== All occurrences of 'summary' (with context) =="
rg -n -C2 'summary' "$FILE" --type=md

echo
echo "== Occurrences of 'Summary' (with context) =="
rg -n -C2 'Summary' "$FILE" --type=md

Repository: glific/docs

Length of output: 6600


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="docs/8. FAQ/Glific BigQuery Tables Guide.md"

# Print around the line range where `period` is defined (from earlier rg output: ~535)
awk 'NR>=520 && NR<=560 {printf "%d:%s\n", NR, $0}' "$FILE"

Repository: glific/docs

Length of output: 1793


Clarify “Summary” period wording to match the period=summary schema

The line-17 text (“generated at the end of each week and month…covers the entire week or month”) reads like Summary represents two different aggregation types. Upstream schema defines period with a single value summary = “overall totals”, and this doc’s field definitions also treat Summary as all-time totals (e.g., contacts Summary returns your all-time total; users Summary returns total user accounts registered). Reword line 17 to explicitly say period=summary is overall/all-time totals, and only mention weekly/monthly timing if you clarify it’s when the all-time totals snapshot is refreshed (e.g., via the date field).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md`
at line 17, reword the existing Summary description so it matches
the schema's `period=summary` meaning: replace the current wording that implies
weekly/monthly aggregation with a clear statement that `period=summary`
represents overall/all-time totals (e.g., `contacts` shows all-time total,
`users` shows total registered), and if you must mention weekly/monthly timing,
clarify those are only the refresh cadence for the snapshot (referencing the
`date` field) rather than different aggregation types.


> **Recommendation**: Always use the **Summary** period for reporting to stakeholders. It is calculated in a single pass and avoids the inconsistencies that can appear when comparing daily rows.
Comment on lines +7 to +19

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider briefly mentioning the date and hour fields for querying.

The Stats Table schema includes date and hour fields that users need when filtering records by time. While the current documentation focuses on explaining what each metric means, a brief note about these dimensional fields would help users understand how to query the data.

For example: "Each stat record also includes a date field (and hour field for hourly records) that indicates when the stats were recorded."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md`
around lines 7 - 19, add a short note to the "Time Periods"
section of the Stats Table docs that mentions the dimensional fields used for
querying: state that each stat record includes a date field and, for hourly
rows, an hour field (e.g., "Each stat record also includes a `date` field and an
`hour` field for hourly records to help filter and group by time"). Update the
paragraph near the top of the Time Periods table (where periods are defined) to
include this sentence so readers know to use `date` and `hour` when querying the
stats table.


---

## Fields

### `contacts`

**What it means**: The number of new contacts who were added to your organisation during this period.

**How it is calculated**: Counts contacts whose creation date falls within the time window. A contact who joined on December 3rd will appear in the December 3rd daily stat, the December weekly stat, and the December monthly stat.

**What to watch for**: This is a count of *new* additions only, not your total contact base. The Summary period is an exception — it returns your all-time total contact count.
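
The windowing described above can be sketched in a few lines of Python. This is only an illustration, not Glific's actual implementation: the function names, the sample creation dates, and the year 2024 are all assumptions made for the example. It shows how one contact created on December 3rd falls inside the day, week (Monday to Sunday), and month windows at the same time.

```python
from datetime import date, timedelta


def period_windows(d: date) -> dict:
    """Return inclusive (start, end) dates for the day, week, and month containing d."""
    week_start = d - timedelta(days=d.weekday())  # weekday() is 0 for Monday
    week_end = week_start + timedelta(days=6)     # the following Sunday
    month_start = d.replace(day=1)
    # Jump safely into the next month, snap to its first day, then step back one day.
    next_month_start = (month_start + timedelta(days=32)).replace(day=1)
    month_end = next_month_start - timedelta(days=1)
    return {
        "day": (d, d),
        "week": (week_start, week_end),
        "month": (month_start, month_end),
    }


def count_new_contacts(created_dates, start: date, end: date) -> int:
    """Count contacts whose creation date falls within [start, end]."""
    return sum(1 for created in created_dates if start <= created <= end)


# Hypothetical creation dates for three contacts.
created = [date(2024, 11, 30), date(2024, 12, 3), date(2024, 12, 20)]

windows = period_windows(date(2024, 12, 3))
counts = {
    name: count_new_contacts(created, start, end)
    for name, (start, end) in windows.items()
}
print(counts)
```

Because each period is counted independently, the contact created on December 3rd contributes to all three rows rather than to just one of them.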

---

### `active`

**What it means**: The number of contacts who sent a message to the bot during this period.

**How it is calculated**: Counts contacts whose "last messaged at" date falls within the time window.

**Important limitation**: A contact's "last messaged at" date is always updated to their most recent interaction. This means if a contact was active on December 1st but also sent a message on December 15th, they will no longer appear in the December 1st daily count.
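
A small sketch can make this limitation concrete. It assumes, purely for illustration, that each contact stores a single "last messaged at" date and that the daily count is recomputed from that stored value; the contact names and dates are hypothetical.

```python
from datetime import date


def daily_active(last_messaged_at: dict, day: date) -> int:
    """Count contacts whose stored last-message date equals `day`."""
    return sum(1 for d in last_messaged_at.values() if d == day)


# Hypothetical contacts, both last seen on December 1st.
last_messaged_at = {
    "contact_a": date(2024, 12, 1),
    "contact_b": date(2024, 12, 1),
}
before = daily_active(last_messaged_at, date(2024, 12, 1))

# contact_a messages again on December 15th; the single stored date is overwritten.
last_messaged_at["contact_a"] = date(2024, 12, 15)
after = daily_active(last_messaged_at, date(2024, 12, 1))

print(before, after)
```

The December 1st figure drops after the later message because only the most recent interaction is kept, which is why a recomputed daily count can be lower than the count taken at the time.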

---

### `optin`

**What it means**: Among the contacts who were active during this period, how many have opted in to receive messages from your organisation.

**How it is calculated**: Takes the set of active contacts for the period (same as the `active` count) and filters down to those who have an opt-in recorded at any point in the past.

**Critical clarification — this is not "who opted in this period"**: This field does not tell you how many people opted in during this specific time window. It tells you how many of the *currently active* contacts have ever opted in. If you want to know how many contacts opted in during a specific month, the contacts table needs to be queried directly. This is also the most likely reason for discrepancies if you have been cross-checking opt-in numbers.
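
The difference between the two readings can be sketched as follows. The `Contact` class, its field names, and the sample data are hypothetical and are not Glific's schema; the sketch only contrasts "active contacts who have ever opted in" with "contacts who opted in during the window".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Contact:
    name: str
    last_messaged_at: date
    optin_date: Optional[date]  # None if the contact never opted in


def optin_stat(contacts, start: date, end: date) -> int:
    """What the `optin` field describes: active contacts with any opt-in on record."""
    active = [c for c in contacts if start <= c.last_messaged_at <= end]
    return sum(1 for c in active if c.optin_date is not None)


def opted_in_during(contacts, start: date, end: date) -> int:
    """The cohort count the field does NOT provide: opt-ins dated inside the window."""
    return sum(
        1 for c in contacts
        if c.optin_date is not None and start <= c.optin_date <= end
    )


# Hypothetical contacts, all active in December 2024.
contacts = [
    Contact("a", date(2024, 12, 10), date(2023, 6, 1)),   # opted in long ago
    Contact("b", date(2024, 12, 12), date(2024, 12, 5)),  # opted in this month
    Contact("c", date(2024, 12, 2), None),                # never opted in
]

start, end = date(2024, 12, 1), date(2024, 12, 31)
stat_value = optin_stat(contacts, start, end)
cohort_value = opted_in_during(contacts, start, end)
print(stat_value, cohort_value)
```

In this sample the field-style count is 2 while only 1 contact actually opted in during December, which is the kind of gap that shows up when cross-checking against the contacts table.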

---

### `optout`

**What it means**: Among the contacts who were active during this period, how many have opted out.

**How it is calculated**: Same approach as `optin` — takes the active contacts and filters to those who have an opt-out recorded at any point in the past.

**Same caveat as optin**: This is not "who opted out this period." It is the overlap between active contacts and contacts who have ever opted out.

---

### `messages`

**What it means**: The total number of messages sent and received during this period, across both directions.

**How it is calculated**: Counts all message records (inbound + outbound) created within the time window.

---

### `inbound`

**What it means**: The number of messages sent *to* the bot by contacts during this period.

**How it is calculated**: Counts only messages where the flow direction is inbound (contact → bot).

**Use case**: This is the closest proxy to genuine user engagement — a contact actively chose to send a message.

---

### `outbound`

**What it means**: The number of messages sent *from* the bot to contacts during this period.

**How it is calculated**: Counts only messages where the flow direction is outbound (bot → contact).

---

### `hsm`

**What it means**: The number of template messages sent during this period. HSM stands for Highly Structured Message, which is WhatsApp's term for pre-approved message templates.

**How it is calculated**: Counts outbound messages that are marked as HSM/template messages.

**Why this matters**: Template messages have a cost associated with them and are used to initiate conversations outside the 24-hour session window. Tracking this helps monitor your messaging costs.

---

### `flows_started`

**What it means**: The number of times a flow (automated conversation sequence) was triggered during this period.

**How it is calculated**: Counts flow sessions that were created within the time window. One contact going through a flow counts as one flow started, even if they send multiple messages within it.

---

### `flows_completed`

**What it means**: The number of flows that were completed (reached the end) during this period.

**How it is calculated**: Counts flow sessions whose completion timestamp falls within the time window.

**Use case**: Comparing `flows_started` to `flows_completed` gives you a drop-off rate — the percentage of contacts who started a flow but did not finish it.
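
As a rough sketch, the drop-off rate can be computed like this. The function name and the sample numbers are assumptions for illustration only.

```python
def drop_off_rate(flows_started: int, flows_completed: int) -> float:
    """Percentage of started flows that were not completed in the same period."""
    if flows_started == 0:
        return 0.0  # avoid dividing by zero when nothing was started
    return (flows_started - flows_completed) / flows_started * 100


# Hypothetical monthly figures.
rate = drop_off_rate(flows_started=200, flows_completed=150)
print(f"{rate:.1f}% drop-off")
```

Since `flows_started` is counted by creation time and `flows_completed` by completion time, a flow that starts in one period and finishes in the next is split across two rows. Treat the result as an approximation, especially for short periods such as hours or days.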

---

### `users`

**What it means**: The number of Glific staff/agent users (not contacts) who logged in during this period.

**How it is calculated**: Counts Glific user accounts whose last login falls within the time window. For the Summary period, it returns the total number of user accounts registered.

**Note**: This tracks your organisation's internal team activity, not the contacts using your bot.

---


## Common Pitfalls

1. **Do not use daily `active` counts for historical reporting.** They are revised downward as contacts continue to message after that date. Use monthly or summary figures instead.

2. **The `optin` and `optout` columns are not cohort counts.** They do not tell you who opted in or out during that period. They tell you which active contacts have an opt-in or opt-out on record.

3. **Weekly and daily figures may not add up perfectly.** Each is calculated independently, so small differences are expected. The Summary figure is always the authoritative number for a given period.

4. **`messages` = `inbound` + `outbound` (approximately).** There may be a small difference due to system-generated messages, but these three fields should roughly balance.
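
A quick way to check pitfall 4 against your own data is to compute the gap between `messages` and `inbound + outbound` for each stat row. The sketch below uses a hypothetical row; the function name and the numbers are assumptions, and it makes no claim about how large the gap should be in practice.

```python
def message_gap(messages: int, inbound: int, outbound: int) -> int:
    """Return how far `messages` is from `inbound + outbound` (0 means they balance)."""
    return messages - (inbound + outbound)


# Hypothetical stat row for one period.
row = {"messages": 1000, "inbound": 420, "outbound": 575}

gap = message_gap(row["messages"], row["inbound"], row["outbound"])
print(f"gap = {gap} ({gap / row['messages']:.1%} of messages)")
```

A gap of zero means the three fields balance exactly for that row; a persistent non-zero gap is worth investigating at the message-record level.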

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for documentation about message counting logic

rg -n -C3 'messages.*inbound.*outbound|message.*count' --type=md -g '*BigQuery*'

Repository: glific/docs

Length of output: 37


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE='docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md'

echo '--- target file: show lines 120-170 ---'
sed -n '120,170p' "$FILE" | nl -ba | sed -e 's/^/LINE /'

echo '--- repo search for field definitions/wording ---'
rg -n --hidden --no-ignore-vcs \
  '(Total messages|Messages received from contacts|Messages sent to contacts|system-generated|inbound|outbound|messages\s*=\s*inbound|messages.*inbound.*outbound)' \
  --type=md docs

Repository: glific/docs

Length of output: 156


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE='docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md'

echo '--- target file: show lines 120-170 (with line numbers via awk) ---'
sed -n '120,170p' "$FILE" | awk '{printf "LINE %d: %s\n", NR+119, $0}'

echo '--- repo search for field definitions/wording ---'
rg -n --hidden --no-ignore-vcs \
  '(Total messages|Messages received from contacts|Messages sent to contacts|system-generated|inbound|outbound|messages\s*=\s*inbound|messages.*inbound.*outbound)' \
  --type=md docs

Repository: glific/docs

Length of output: 4631


🏁 Script executed:

#!/bin/bash
set -euo pipefail

DOC1='docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md'
DOC2='docs/8. FAQ/Glific BigQuery Tables Guide.md'

echo '--- target file: show lines 50-110 ---'
sed -n '50,110p' "$DOC1" | awk '{print NR+49 ":" $0}'

echo '--- target file: show around Common Pitfalls line 139-150 ---'
sed -n '135,155p' "$DOC1" | awk '{print NR+134 ":" $0}'

echo '--- BigQuery guide: show around messages/inbound/outbound definitions (510-560) ---'
sed -n '510,560p' "$DOC2" | awk '{print NR+509 ":" $0}'

echo '--- search docs for "system-generated" ---'
rg -n --hidden --no-ignore-vcs 'system-generated' --type=md docs

Repository: glific/docs

Length of output: 6310


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n --hidden --no-ignore-vcs \
  '(system[- ]generated|system message|generated message|other direction|flow direction.*null|inbound \+ outbound|inbound.*outbound.*balance|roughly balance|messages.*inbound.*outbound)' \
  --type=md docs

Repository: glific/docs

Length of output: 517


Align messages vs inbound+outbound wording with the documented definition

In docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md, messages is already defined as “all message records (inbound + outbound)” for the time window, and the BigQuery guide defines the same relationship. The “(approximately)” + “system-generated messages” caveat at line 147 isn’t explained anywhere else in the docs—either state the relationship without “approximately” or add a concrete description of which message records cause divergence.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/4. Product Features/09. Reporting & Dashboard/09. Understanding the Stats Table.md`
at line 147, update the sentence about messages so it's
unambiguous: either remove "(approximately)" and state "messages = inbound +
outbound" exactly, or keep the caveat and enumerate which records cause
divergence (e.g., system-generated notifications, automated bot messages,
delivery receipts/failed-send system events, or deduplicated webhook retries)
and explain how they are counted in the time window; update the line referencing
the fields "messages", "inbound", and "outbound" in the Understanding the Stats
Table doc to include the chosen wording and a short concrete example of a
system-generated record that would make totals differ.
