-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
own: index recent contributors for ownership signals #51188
Conversation
# Conflicts: # internal/database/schema.json # internal/database/schema.md # migrations/BUILD.bazel # migrations/frontend/squashed.sql
# Conflicts: # internal/database/recent_contribution_signal.go
"github.com/sourcegraph/sourcegraph/internal/types" | ||
) | ||
|
||
func Test_RecentContributorIndexFromGitserver(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is largely the same test from the store, but feeding a mock gitserver through to the DB. It effectively tests the entire flow, except the queue.
@@ -324,6 +324,63 @@ func (c *clientImplementor) CommitGraph(ctx context.Context, repo api.RepoName, | |||
return gitdomain.ParseCommitGraph(strings.Split(string(out), "\n")), nil | |||
} | |||
|
|||
func (c *clientImplementor) CommitLog(ctx context.Context, repo api.RepoName, after time.Time) ([]CommitLog, error) { | |||
args := []string{"log", "--pretty=format:%H<!>%ae<!>%an<!>%ad", "--name-only", "--topo-order", "--no-merges"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added no merges, does that match your thinking @cbart?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes total sense to me
Codenotify: Notifying subscribers in CODENOTIFY files for diff 931d6ad...afdd075.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I left some comments, I guess the main thing is to add some test coverage for a new gitserver command
} | ||
|
||
func (h *handler) Handle(ctx context.Context, logger log.Logger, record *Job) error { | ||
func (h *handler) Handle(ctx context.Context, lgr log.Logger, record *Job) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we have a bit of inconsistent logger naming, in structs it is logger
, in function parameters I can see lgr
, should we unify it?
I guess lgr
is to fix the prefix of logger
clashing with the imported log
package? At least this is what Goland usually complains to me about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, when you have local variables that shadow the package name. It's annoying because sometimes we have the import alias log
and other times logger
, so I've been trying something new with lgr
to always have a consistent import logger
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, makes sense
@@ -324,6 +324,63 @@ func (c *clientImplementor) CommitGraph(ctx context.Context, repo api.RepoName, | |||
return gitdomain.ParseCommitGraph(strings.Split(string(out), "\n")), nil | |||
} | |||
|
|||
func (c *clientImplementor) CommitLog(ctx context.Context, repo api.RepoName, after time.Time) ([]CommitLog, error) { | |||
args := []string{"log", "--pretty=format:%H<!>%ae<!>%an<!>%ad", "--name-only", "--topo-order", "--no-merges"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes total sense to me
@@ -324,6 +324,63 @@ func (c *clientImplementor) CommitGraph(ctx context.Context, repo api.RepoName, | |||
return gitdomain.ParseCommitGraph(strings.Split(string(out), "\n")), nil | |||
} | |||
|
|||
func (c *clientImplementor) CommitLog(ctx context.Context, repo api.RepoName, after time.Time) ([]CommitLog, error) { | |||
args := []string{"log", "--pretty=format:%H<!>%ae<!>%an<!>%ad", "--name-only", "--topo-order", "--no-merges"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drive by review
Have we considered benchmarking this function to see how this would perform for monorepos with 500k to 1M commits? Asking because we recently saw a bunch of customer issues where load was the primary bottleneck and it seems like this command can be pretty open ended and in the worst case list all the repo's commits?
I'm trying to anticipate problems that could occur in the future but if you all don't see an issue here - feel free to ignore.
Also the benchmarking (if it makes sense) is not a blocking review for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@indradhanush yep, definitely something we are going to test once this can go onto a customer-like instance. Most likely we will end up paging this anyway so we aren't always running the log from HEAD (this is how Code Insights works as well, we log from older commits to get reference hashes)
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
Co-authored-by: Alex Ostrikov <alex.ostrikov@sourcegraph.com>
@@ -3206,3 +3208,54 @@ func TestBlameHunkReader(t *testing.T) { | |||
} | |||
}) | |||
} | |||
|
|||
func Test_CommitLog(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀🚀🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR hooks up the recent contributors indexer with the background job, effectively enabling indexing once the feature flag is switched on.
Test plan
You can start Sourcegraph and let the jobs auto-index, or you can manually control the queue through the
own_background_jobs
table.Here is some example output auto-indexing the local dev repos: