Update user definitions page with technical info on how we count users #3524

Merged
merged 7 commits into from
Jul 6, 2021
Update user_definitions.md
attfarhan authored Jul 1, 2021
commit 1cac9da6e43b1273e7efce9cbfcca913c844ec7e
20 changes: 20 additions & 0 deletions handbook/ops/bizops/user_definitions.md
@@ -30,3 +30,23 @@ We track the following categories in pings for each month. The explanations are
|---------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| DAU/MAU | The ratio of average DAUs over a month to the number of MAUs in the corresponding month. If the ratio is 0.4 or 40%, the average user used Sourcegraph 12 days per month (30 days * .4 = 12). |
| DAU/WAU | The ratio of average DAUs over a week to the number of WAUs in the corresponding week. If the ratio is 0.4 or 40%, the average user used Sourcegraph 2.8 days per week (7 days * .4 = 2.8). |
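
The arithmetic behind these ratios can be sketched as follows. This is a minimal illustration, not code from the Sourcegraph repository; the function name `stickiness` and the sample numbers are assumptions made up for the example.

```python
def stickiness(daily_active_counts, period_active_users):
    """Return average daily active users divided by active users for the period.

    `daily_active_counts` holds one DAU value per day in the period;
    `period_active_users` is the MAU (or WAU) for that same period.
    Both names are hypothetical and used only for this sketch.
    """
    if not daily_active_counts:
        raise ValueError("need at least one daily count")
    if period_active_users <= 0:
        raise ValueError("period_active_users must be positive")
    average_dau = sum(daily_active_counts) / len(daily_active_counts)
    return average_dau / period_active_users


# Made-up month: 40 daily active users on each of 30 days, 100 MAUs.
ratio = stickiness([40] * 30, 100)

# Multiplying the ratio by the length of the period gives the number of
# days the average user was active, matching the table's 30 * 0.4 = 12.
days_active_per_month = 30 * ratio
```

The same function covers DAU/WAU by passing seven daily counts and the WAU for that week.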

# How are users calculated in the app?


Our metrics infrastructure (Looker, Amplitude) gets user counts from the `event_logs` table in our database.

### On-Prem Instances
In each ping, an instance sends the `site_activity.DAUs`, `site_activity.WAUs`, and `site_activity.MAUs` fields, which represent the daily, weekly, and monthly user counts on that instance. These numbers are taken from the instance's `event_logs` table, and count the distinct user IDs that executed any action on the instance in the given time period. See the [`activeUsers` function](https://sourcegraph.com/search?q=context:global+repo:%5Egithub%5C.com/sourcegraph/sourcegraph%24%407eeeb9b+func+activeUsers&patternType=literal) for the implementation and SQL query.
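
As a rough illustration of "count the distinct user IDs in a time period", here is a minimal sketch using an in-memory SQLite database. It is not the query used by `activeUsers`: the three-column table layout, the event names, and the helper `count_active_users` are simplifying assumptions for this example, and the real table and query live in the linked implementation.

```python
import sqlite3

# Assumed, simplified stand-in for the event_logs table (hypothetical columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_logs (user_id INTEGER, name TEXT, timestamp TEXT)")
conn.executemany(
    "INSERT INTO event_logs VALUES (?, ?, ?)",
    [
        (1, "SearchResultsQueried", "2021-06-01T10:00:00"),  # made-up events
        (1, "ViewBlob", "2021-06-01T11:00:00"),
        (2, "ViewBlob", "2021-06-03T09:00:00"),
        (3, "SearchResultsQueried", "2021-06-20T09:00:00"),
    ],
)


def count_active_users(conn, start, end):
    """Count distinct user IDs with at least one event in [start, end)."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM event_logs "
        "WHERE timestamp >= ? AND timestamp < ?",
        (start, end),
    ).fetchone()
    return row[0]


# ISO 8601 strings compare correctly as text, so plain string bounds work here.
dau = count_active_users(conn, "2021-06-01", "2021-06-02")
wau = count_active_users(conn, "2021-06-01", "2021-06-08")
mau = count_active_users(conn, "2021-06-01", "2021-07-01")
```

User 1 has two events on the same day but is counted once, which is the point of counting distinct IDs rather than events.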

### Sourcegraph Cloud

For Sourcegraph Cloud, we use the same method for calculating user counts, pulling from the Sourcegraph Cloud database. On Cloud, however, we also track unauthenticated users by their `anonymous_user_id`. This is a separate column that holds an anonymous ID, which is stored in a cookie for users who visit Sourcegraph.com. As a result, all charts that track Cloud active users include unauthenticated users.

There are shortcomings to this approach. For one, when a user converts into an authenticated user, the events recorded under their anonymous user ID are still in the database, so we count two different active users rather than one. For analytics purposes in Amplitude, this is also not ideal because we cannot connect a user's actions before and after they convert.
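
The double counting can be shown with a small sketch. The event records, the field layout, and both helper functions below are hypothetical; in particular, the sketch assumes that events recorded after sign-up still carry the same `anonymous_user_id`, and `merged_active_users` is one possible reconciliation for illustration, not something the source describes Sourcegraph as doing.

```python
# Made-up events: user_id 0 stands for "not signed in" in this sketch.
events = [
    {"user_id": 0, "anonymous_user_id": "anon-a"},   # visitor before sign-up
    {"user_id": 42, "anonymous_user_id": "anon-a"},  # same person after sign-up
    {"user_id": 0, "anonymous_user_id": "anon-b"},   # a different visitor
]


def naive_active_users(events):
    """Count distinct identities, treating anonymous IDs and user IDs separately."""
    identities = set()
    for event in events:
        if event["user_id"]:
            identities.add(("user", event["user_id"]))
        else:
            identities.add(("anon", event["anonymous_user_id"]))
    return len(identities)


def merged_active_users(events):
    """Count identities after mapping an anonymous ID to a user ID seen with it."""
    anon_to_user = {
        event["anonymous_user_id"]: event["user_id"]
        for event in events
        if event["user_id"]
    }
    identities = set()
    for event in events:
        user_id = event["user_id"] or anon_to_user.get(event["anonymous_user_id"])
        if user_id:
            identities.add(("user", user_id))
        else:
            identities.add(("anon", event["anonymous_user_id"]))
    return len(identities)


naive_count = naive_active_users(events)    # the converted user is counted twice
merged_count = merged_active_users(events)  # the converted user is counted once
```

With these three events, two people were active, but the naive count reports three.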

### In-app site admin usage stats page

In the site admin panel, the Usage stats page displays the number of MAUs. This page pulls its data from Redis, which is populated by our `usagestatsdeprecated` package. That is an older way of collecting data and is not reliable. This is a known issue that has been raised, and the page will be changed to use the `event_logs` table.