Sessions Today GA4 data is not returning SUM as expected
scottqueen-bixal opened this issue · comments
Background
We have noticed that the values returned from requests that include metric: sessions and date range today -> today do not have the same SUM values as those consumed from the same requests in the UA client.
User Story
As a visitor to https://analytics.usa.gov/, I should find a bar graph populated with session data
Acceptance Criteria
- report returns sessions ~MIL values per hour
- report provides value objects that include current day -> up to current hour -> through 11am (0-23)
Sampling level is automatic | today, top-pages-7-days | We can’t adjust sampling precision, but we still get a metadata response when sampling has been performed. In some cases sampling is done while processing is still in progress, so the data is not yet golden. When this occurs, GA4 buckets values into the “(other)” key. For reports like today, with date range “today” -> “today”, this has a significant impact. By updating the report date range to start from “yesterday”, we get a much more accurate SUM value, but need to handle some data clean-up on FE rendering. |
---|---|---|
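The front-end clean-up described in the row above could look roughly like the following sketch. It assumes each row carries `date`, `hour`, and `sessions` fields and that bucketed rows show `(other)` in a dimension field; these names and the helper `clean_hourly_rows` are illustrative, not the actual report schema.

```python
from datetime import date


def clean_hourly_rows(rows, today):
    """Keep only today's hourly rows and total up the '(other)' bucket.

    `rows` is assumed to be a list of dicts with string values for
    "date" (YYYY-MM-DD), "hour" (00-23) and "sessions".
    """
    kept = []
    other_total = 0
    for row in rows:
        sessions = int(row["sessions"])
        # Rows that were bucketed carry "(other)" instead of a real value.
        if row["date"] == "(other)" or row["hour"] == "(other)":
            other_total += sessions
            continue
        # Drop yesterday's rows; they are only requested to improve accuracy.
        if row["date"] == today.isoformat():
            kept.append({"hour": int(row["hour"]), "sessions": sessions})
    kept.sort(key=lambda r: r["hour"])
    return kept, other_total
```

Returning the `(other)` total separately lets the chart ignore it while still making it visible for debugging.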
We moved to using the last-48-hours.json report to provide values for the sessions bar chart.
https://analytics-develop.app.cloud.gov/data/live/last-48-hours.json
This includes yesterday's values, but it is run during realtime.sh, so we get updated values every ~15min.
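For illustration, a request body for an hourly sessions report that starts from yesterday might be built as in the sketch below. The field names follow the GA4 Data API `runReport` request format; the helper name and the choice of `date` and `hour` dimensions are assumptions, not the project's actual report definition.

```python
def build_hourly_sessions_request(start_date="yesterday", end_date="today"):
    """Build a runReport-style request body for hourly sessions.

    Starting from "yesterday" rather than "today" is the workaround
    discussed in this issue.
    """
    return {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": "date"}, {"name": "hour"}],
        "metrics": [{"name": "sessions"}],
    }
```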
This data is inconsistent. At some point during the 24-hour cycle that the cron runs, our last-48-hours report also reverts to an automatic sampling value, similar to the today report. You can see the break in values clearly on the chart: the two high bars are the (other) bucket.
~9:30am
It eventually returns; today it normalized around ~10:30 am EST.
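One way to catch these windows before publishing a report is to inspect the response metadata. The sketch below assumes the response is the parsed JSON from a GA4 Data API `runReport` call and that its `metadata` object exposes `dataLossFromOtherRow` and `samplingMetadatas`; the helper name is hypothetical and the field names should be checked against the API reference.

```python
def sampling_warnings(response):
    """Return a list of reasons the report may be unreliable (empty if none)."""
    metadata = response.get("metadata", {})
    warnings = []
    # Assumed flag: set when rows were rolled up into the "(other)" bucket.
    if metadata.get("dataLossFromOtherRow"):
        warnings.append("rows were rolled into (other)")
    # Assumed list: non-empty when the report results were sampled.
    if metadata.get("samplingMetadatas"):
        warnings.append("report was sampled")
    return warnings
```

A cron job could skip overwriting the published JSON whenever this list is non-empty, keeping the last good values on the chart.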
Closing in favor of https://github.com/orgs/18F/projects/55/views/1?pane=issue&itemId=50153131
Some additional conversation on this issue is here: https://gsa-tts.slack.com/archives/C05S1B327MH/p1705595516913689?thread_ts=1705595482.488289&cid=C05S1B327MH
This user puts it well, https://stackoverflow.com/a/76550342, and recommends avoiding the use of 'today' for report requests.