Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(database): add support for schedules #834

Merged
merged 30 commits into from
May 18, 2023
Merged

Conversation

jbrockopp
Copy link
Contributor

@jbrockopp jbrockopp commented Apr 28, 2023

xref: go-vela/community#538

Part of go-vela/community#772

Dependent on go-vela/types#289

This adds a new schedule package to the github.com/go-vela/server/database package.

This contains a ScheduleInterface interface declaring all functions necessary for schedule based interactions with the database:

// functions with the supported Database backends.
//
//nolint:revive // ignore name stutter
type ScheduleInterface interface {
// Schedule Data Definition Language Functions
//
// https://en.wikipedia.org/wiki/Data_definition_language
// CreateScheduleIndexes defines a function that creates the indexes for the schedules table.
CreateScheduleIndexes() error
// CreateScheduleTable defines a function that creates the schedules table.
CreateScheduleTable(string) error
// Schedule Data Manipulation Language Functions
//
// https://en.wikipedia.org/wiki/Data_manipulation_language
// CountSchedules defines a function that gets the count of all schedules.
CountSchedules() (int64, error)
// CountSchedulesForRepo defines a function that gets the count of schedules by repo ID.
CountSchedulesForRepo(*library.Repo) (int64, error)
// CreateSchedule defines a function that creates a new schedule.
CreateSchedule(*library.Schedule) error
// DeleteSchedule defines a function that deletes an existing schedule.
DeleteSchedule(*library.Schedule) error
// GetSchedule defines a function that gets a schedule by ID.
GetSchedule(int64) (*library.Schedule, error)
// GetScheduleForRepo defines a function that gets a schedule by repo ID and number.
GetScheduleForRepo(*library.Repo, string) (*library.Schedule, error)
// ListActiveSchedules defines a function that gets a list of all active schedules.
ListActiveSchedules() ([]*library.Schedule, error)
// ListSchedules defines a function that gets a list of all schedules.
ListSchedules() ([]*library.Schedule, error)
// ListSchedulesForRepo defines a function that gets a list of schedules by repo ID.
ListSchedulesForRepo(*library.Repo, int, int) ([]*library.Schedule, int64, error)
// UpdateSchedule defines a function that updates an existing schedule.
UpdateSchedule(*library.Schedule) error
}

This package also contains the engine which implements the above schedule interface:

// engine represents the schedule functionality that implements the ScheduleInterface interface.
engine struct {
// engine configuration settings used in schedule functions
config *config
// gorm.io/gorm database client used in schedule functions
//
// https://pkg.go.dev/gorm.io/gorm#DB
client *gorm.DB
// sirupsen/logrus logger used in schedule functions
//
// https://pkg.go.dev/github.com/sirupsen/logrus#Entry
logger *logrus.Entry
}

This engine contains no raw SQL queries for integrating with the schedules table.

Instead, we leverage our DB library's (https://gorm.io/) agnostic abstraction for integrating with that table.

@jbrockopp jbrockopp added the feature Indicates a new feature label Apr 28, 2023
@jbrockopp jbrockopp self-assigned this Apr 28, 2023
database/schedule/create.go Show resolved Hide resolved
database/schedule/update.go Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Apr 28, 2023

Codecov Report

❗ No coverage uploaded for pull request base (main@61419f4). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head ff41fa7 differs from pull request most recent head 18e097e. Consider uploading reports for the commit 18e097e to get more accurate results

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #834   +/-   ##
=======================================
  Coverage        ?   59.48%           
=======================================
  Files           ?      284           
  Lines           ?    15798           
  Branches        ?        0           
=======================================
  Hits            ?     9397           
  Misses          ?     5958           
  Partials        ?      443           

@JordanSussman
Copy link
Collaborator

@jbrockopp can we mark this as Ready for review?

Copy link
Contributor

@ecrupper ecrupper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple random typos / nits as well as a question related to the count request for those list functions.

database/schedule/get_repo.go Outdated Show resolved Hide resolved
database/schedule/interface.go Outdated Show resolved Hide resolved
Comment on lines +22 to +26
// count the results
count, err := e.CountSchedules()
if err != nil {
return nil, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if this is necessary. It doesn't look like we use it other than short circuiting an empty table, and the COUNT query is fairly taxing (at least in Postgres): https://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT#COUNT.28.2A.29

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To provide context, this was adopted from an existing pattern we already use for builds:

// count the results
count, err := c.GetOrgBuildCount(org, filters)
if err != nil {
return builds, 0, err
}
// short-circuit if there are no results
if count == 0 {
return builds, 0, nil
}
// calculate offset for pagination through results
offset := perPage * (page - 1)
// send query to the database and store result in variable
err = c.Postgres.
Table(constants.TableBuild).
Select("builds.*").
Joins("JOIN repos ON builds.repo_id = repos.id and repos.org = ?", org).
Where(filters).
Order("created DESC").
Order("id").
Limit(perPage).
Offset(offset).
Scan(b).Error

And this code follows the same pattern used in the other database functions to list resources e.g.

In general, this need for the count stems from usage for API pagination.

Before I started the DB refactor efforts, we still called the count for API but did so in a separate function i.e.

server/api/service.go

Lines 241 to 259 in 8c2fdb2

// send API call to capture the total number of services for the build
t, err := database.FromContext(c).GetBuildServiceCount(b)
if err != nil {
retErr := fmt.Errorf("unable to get services count for build %s: %w", entry, err)
util.HandleError(c, http.StatusInternalServerError, retErr)
return
}
// send API call to capture the list of services for the build
s, err := database.FromContext(c).GetBuildServiceList(b, page, perPage)
if err != nil {
retErr := fmt.Errorf("unable to get services for build %s: %w", entry, err)
util.HandleError(c, http.StatusInternalServerError, retErr)
return
}

I decided it was worth adopting the pattern setup for builds since it reduces the need to call them separately.

Copy link
Contributor Author

@jbrockopp jbrockopp May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I thought it was worth highlighting a statement from the document you linked:

- turns out I misread the article.

In general, calling count(*) for a specific table isn't very taxing.

Could you elaborate on your concern?

Are you facing issues with this for other tables and that's why you're mentioning it here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I don't really have any issues with the implementation, nor are we experiencing performance problems. Though, COUNT queries across a specific table without any WHERE clause related to an index are relatively slow (or slower than you'd expect is probably a better descriptor): https://wiki.postgresql.org/wiki/Slow_Counting.

I'm fine moving this conversation elsewhere as a potential reconsideration of design since it's endemic across the code base, as you pointed out 👍

Just trying to give all my thoughts so I don't forget 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure 👍 perhaps it's worth opening an issue for this so it doesn't get forgotten?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

database/schedule/list_active_test.go Show resolved Hide resolved
database/schedule/list_repo.go Show resolved Hide resolved
@jbrockopp jbrockopp marked this pull request as ready for review May 16, 2023 19:12
@jbrockopp jbrockopp requested a review from a team as a code owner May 16, 2023 19:12
Copy link
Contributor

@ecrupper ecrupper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ecrupper ecrupper merged commit d0f63b1 into main May 18, 2023
@ecrupper ecrupper deleted the feature/schedule/database branch May 18, 2023 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants