Contributor

@basbroek basbroek commented Sep 4, 2025

Problem

The numbers for Views, Visitors and Visits on the Events and Sessions page are different from those on the Overview page.

Root Cause

On the Events and Sessions page, all website_event rows are counted, both pageviews and custom events. So if one user in one session visits one page and triggers one custom event on that page, 2 views are counted instead of 1. See the image below.
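
The double count can be reproduced with a minimal in-memory sketch. The `WebsiteEvent` shape and its field names are assumptions made for illustration and are not taken from the Umami schema; the values 1 (pageview) and 2 (custom event) follow the event_type values mentioned in this conversation.

```typescript
// Hypothetical row shape; field names are illustrative only.
interface WebsiteEvent {
  sessionId: string;
  visitId: string;
  eventType: 1 | 2; // 1 = pageview, 2 = custom event
}

// One session, one page visit, one custom event triggered on that page.
const rows: WebsiteEvent[] = [
  { sessionId: "s1", visitId: "v1", eventType: 1 },
  { sessionId: "s1", visitId: "v1", eventType: 2 },
];

// Counting every row treats the custom event as a view.
const viewsAllRows = rows.length;

// Counting only pageview rows yields the expected single view.
const viewsPageviewsOnly = rows.filter((r) => r.eventType === 1).length;

console.log({ viewsAllRows, viewsPageviewsOnly });
```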

Solution

Ignore customEvents when calculating pageviews, visitors and visits.

Example

[Image: umami-wrong-stats]

vercel bot commented Sep 4, 2025

@basbroek is attempting to deploy a commit to the umami-software Team on Vercel.

A member of the Team first needs to authorize it.

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR fixes a data accuracy issue in the website statistics calculation by properly filtering event types when computing pageviews, visitors, visits, and countries metrics. The root problem was that the getWebsiteSessionStats function was counting all website events (both pageviews with event_type = 1 and custom events with event_type = 2) when calculating core metrics, leading to inflated numbers on the Events and Sessions pages compared to the Overview page.

The fix modifies the relational database query to use PostgreSQL's FILTER clause to distinguish between different event types. Now pageviews, visitors, visits, and countries are calculated only from actual page visits (event_type = 1), while custom events (event_type = 2) are counted separately. This ensures that a single user session with one page visit and one custom event correctly registers as 1 pageview instead of 2.
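
The filtered aggregation described above can be mirrored in a short in-memory sketch. This is not the PR's query: `sessionStats`, the `WebsiteEvent` shape, and the field names are hypothetical, and the countries metric is omitted for brevity. The comments name the PostgreSQL `FILTER` expressions that each line is roughly analogous to.

```typescript
// Hypothetical row shape; field names are illustrative only.
interface WebsiteEvent {
  sessionId: string;
  visitId: string;
  eventType: number;
}

const PAGEVIEW = 1;
const CUSTOM_EVENT = 2;

// Hypothetical helper: computes each metric from the matching event type only.
function sessionStats(rows: WebsiteEvent[]) {
  const pageviewRows = rows.filter((r) => r.eventType === PAGEVIEW);
  return {
    // Analogous to: count(*) FILTER (WHERE event_type = 1)
    pageviews: pageviewRows.length,
    // Analogous to: count(DISTINCT session_id) FILTER (WHERE event_type = 1)
    visitors: new Set(pageviewRows.map((r) => r.sessionId)).size,
    // Analogous to: count(DISTINCT visit_id) FILTER (WHERE event_type = 1)
    visits: new Set(pageviewRows.map((r) => r.visitId)).size,
    // Analogous to: count(*) FILTER (WHERE event_type = 2)
    events: rows.filter((r) => r.eventType === CUSTOM_EVENT).length,
  };
}

// One session with one page visit and one custom event.
const stats = sessionStats([
  { sessionId: "s1", visitId: "v1", eventType: 1 },
  { sessionId: "s1", visitId: "v1", eventType: 2 },
]);

console.log(stats);
```

With this input the custom event is reported under `events` and no longer contributes to `pageviews`.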

However, the implementation is inconsistent. The relational query (PostgreSQL/MySQL via Prisma) applies the event-type filter to every metric, but the ClickHouse query does not. The ClickHouse version uses sumIf(1, event_type = 1) for pageviews, yet computes visitors with uniq(session_id) and visits with uniq(visit_id) without any filter, so those two metrics may still be inflated by custom events.
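
To make the concern concrete, here is a small sketch comparing an unfiltered distinct count with a filtered one. It uses exact in-memory sets, whereas ClickHouse's `uniq` is an approximate aggregate; the row shape, the helper `uniqueSessions`, and the sample data are assumptions for illustration. Note that a distinct count is not affected by extra rows from the same session, so in this sketch the two results differ only because one session contains a custom event and no pageview. In ClickHouse, a filtered variant could be written with the `-If` combinator, for example `uniqIf(session_id, event_type = 1)`.

```typescript
// Hypothetical row shape; field names are illustrative only.
interface WebsiteEvent {
  sessionId: string;
  visitId: string;
  eventType: number; // 1 = pageview, 2 = custom event
}

// Exact distinct count of sessions (stand-in for an approximate uniq()).
const uniqueSessions = (rows: WebsiteEvent[]): number =>
  new Set(rows.map((r) => r.sessionId)).size;

const rows: WebsiteEvent[] = [
  // Session s1: one pageview and one custom event.
  { sessionId: "s1", visitId: "v1", eventType: 1 },
  { sessionId: "s1", visitId: "v1", eventType: 2 },
  // Session s2 (hypothetical): a custom event without any pageview.
  { sessionId: "s2", visitId: "v2", eventType: 2 },
];

// Unfiltered, like uniq(session_id): every session with any event counts.
const visitorsUnfiltered = uniqueSessions(rows);

// Filtered, like uniqIf(session_id, event_type = 1): pageview sessions only.
const visitorsFiltered = uniqueSessions(rows.filter((r) => r.eventType === 1));

console.log({ visitorsUnfiltered, visitorsFiltered });
```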

Confidence score: 3/5

  • This PR partially addresses a legitimate data accuracy bug but has implementation inconsistencies
  • The score reflects that the Prisma implementation is correct, while the ClickHouse implementation may still have the original issue
  • Pay close attention to the ClickHouse query logic in lines 64-68, which does not consistently filter by event type

1 file reviewed, 1 comment

Contributor Author

basbroek commented Sep 4, 2025

I'm not able to check the ClickHouse implementation, so please pay extra attention to those lines in review.
