Include appropriate complete_platforms for serverless/FaaS environments runtime #18195

Closed
huonw opened this issue Feb 7, 2023 · 5 comments · Fixed by #19253 or #21248
Labels
backend: Python Python backend-related issues enhancement

huonw commented Feb 7, 2023

Is your feature request related to a problem? Please describe.

Currently, building serverless artefacts like AWS Lambda or GCF supports both a quick-and-easy runtime setting and a complete_platforms setting. Generally, setting complete_platforms is required to get the right wheels, while setting runtime is not useful, is misleading, or causes problems (e.g. #15296, #18001, and a continual stream of questions in Slack).

Describe the solution you'd like

I think the environments for a given runtime are generally pretty stable, so pants itself could potentially include an appropriate complete platform JSON for each runtime setting (so 3 for AWS Lambda, and 4 for GCF). That is, something like python_awslambda(..., runtime="python39") would be translated into calling pex with a pants-provided complete platform JSON, rather than just a simple platform identifier.

Having pants provide a complete platform almost certainly won't be worse than not using a complete platform at all, and would hopefully mean fewer users have problems. Any users who still have problems can provide their own complete platform.

Downside: if the complete platform pants provides is changed (e.g. pants 2.35 ships a different JSON from pants 2.34), built artefacts may change after a user upgrades to 2.35, due to differing wheel selection.

Describe alternatives you've considered

Just not doing this, because the downsides/risks are too large?

Additional context

The complete platforms can be generated by running PEX in each of the relevant environments. For instance, we're deploying AWS Lambdas using a complete platform generated by:

import subprocess

def lambda_handler(event, context):
    subprocess.run(
        """
        pip install --target=/tmp/subdir pex
        PYTHONPATH=/tmp/subdir /tmp/subdir/bin/pex3 interpreter inspect --markers --tags
        """,
        shell=True
    )
    return {
        'statusCode': 200,
        'body': "{}",
    }

I think this would "just" mean switching the following bit of code to select an appropriate pants-provided JSON file, rather than just constructing the platform string:

# We hardcode the platform value to the appropriate one for each AWS Lambda runtime.
# (Running the "hello world" lambda in the example code will report the platform, and can be
# used to verify correctness of these platform strings.)
pex_platforms = []
interpreter_version = field_set.runtime.to_interpreter_version()
if interpreter_version:
    py_major, py_minor = interpreter_version
    platform_str = f"linux_x86_64-cp-{py_major}{py_minor}-cp{py_major}{py_minor}"
    # set pymalloc ABI flag - this was removed in python 3.8 https://bugs.python.org/issue36707
    if py_major <= 3 and py_minor < 8:
        platform_str += "m"
    if (py_major, py_minor) == (2, 7):
        platform_str += "u"
    pex_platforms.append(platform_str)
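
To make the snippet above easier to try in isolation, here is a minimal standalone sketch of the same string construction. The function name `naive_platform_string` is hypothetical and not part of Pants; the logic is copied from the quoted code.

```python
def naive_platform_string(py_major: int, py_minor: int) -> str:
    """Build the simple PEX platform string for a Linux x86_64 CPython runtime.

    Hypothetical helper: it mirrors the quoted Pants snippet, nothing more.
    """
    platform_str = f"linux_x86_64-cp-{py_major}{py_minor}-cp{py_major}{py_minor}"
    # The pymalloc "m" ABI flag was removed in Python 3.8.
    if py_major <= 3 and py_minor < 8:
        platform_str += "m"
    # Python 2.7 additionally gets the "u" ABI flag.
    if (py_major, py_minor) == (2, 7):
        platform_str += "u"
    return platform_str


if __name__ == "__main__":
    for version in [(2, 7), (3, 7), (3, 9)]:
        print(version, naive_platform_string(*version))
```

Note that the resulting string only encodes the interpreter version and ABI, with no manylinux/glibc information, which is presumably part of why wheel selection from it can go wrong.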

@ShantanuKumar ShantanuKumar added the backend: Python Python backend-related issues label Feb 14, 2023
huonw added a commit that referenced this issue May 20, 2023
This fixes #18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format
is just: put everything at the top-level, e.g. the zip contains
`cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl` (plus the
dynamic PEX initialisation).

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric                        | before   | after            |
|-------------------------------|----------|------------------|
| init time on cold start       | 2.3-2.5s | 1.3-1.4s (-1s)   |
| compressed size               | 24.6MB   | 23.8MB (-0.8MB)  |
| uncompressed size             | 117.8MB  | 115.8MB (-2.0MB) |
| PEX-construction build time   | ~5s      | ~5s              |
| PEX-postprocessing build time | 0.14s    | 4.8s             |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

This functionality is driven by adding a new `layout` field. It defaults
to `lambdex` (retaining the current code paths), but also supports
`zip`, which keys into the functionality above. I've tried to keep the
non-lambdex implementation generally separate to the lambdex one, rather
than reusing all of the code that happens to be common currently,
because it'd make sense to deprecate/remove the lambdex functionality
and thus I feel it's best for this new functionality to be mostly a
fresh start.

This PR's commits can be reviewed independently. It comes in three
phases:

1. Add the `pex_venv.py` util rules for running `pex3 venv create ...`.
Currently this only supports a limited subset of the functionality
there, but can presumably be expanded freely as required. (First commit)
2. Do some minor refactoring. (Commits labelled "refactor: ...")
3. Draw the rest of the owl. (The others.)

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- deprecate `layout="lambdex"` (in favour of `layout="zip"` and/or
normal `pex_binary`) (#19032)
- add a warning about `files` being loaded into these packages, which
has been temporarily lost (#19027)
- adjust documentation
- other improvements like #18195 and #18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first
thejcannon pushed a commit that referenced this issue May 23, 2023
This fixes #18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the
zip contains `cowsay/__init__.py` etc., rather than
`.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX
initialisation/venv creation.
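
As a rough illustration of the difference between the two layouts, the sketch below builds two tiny in-memory zips and checks whether a package sits at the top level. The helper names, the `handler.py` entry and the `cowsay-VERSION.whl` directory name are placeholders for illustration, not the real contents of a Pants-built artefact.

```python
import io
import zipfile
from typing import Dict


def make_zip(entries: Dict[str, str]) -> bytes:
    """Create an in-memory zip from a mapping of archive path to file content."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in entries.items():
            zf.writestr(name, content)
    return buf.getvalue()


def looks_flat(zip_bytes: bytes, package: str) -> bool:
    """Return True if `<package>/__init__.py` is at the top level of the zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return f"{package}/__init__.py" in zf.namelist()


# Placeholder archives: only the overall shape matters here.
flat_zip = make_zip({"cowsay/__init__.py": "", "handler.py": ""})
pex_like_zip = make_zip({".deps/cowsay-VERSION.whl/cowsay/__init__.py": ""})

if __name__ == "__main__":
    print("flat layout:", looks_flat(flat_zip, "cowsay"))
    print("pex-like layout:", looks_flat(pex_like_zip, "cowsay"))
```

With the flat layout, the FaaS runtime can import `cowsay` straight from the unpacked archive, so no start-up step is needed to lay the dependencies out.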

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size |  24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new option to the
`[lambdex].layout` option added in #19074. In #19074 (targeted for
2.17), it defaults to `lambdex` (retaining the current code paths). This PR
flips the default to the new option `zip`, which keys into the
functionality above. I've tried to keep the non-lambdex implementation
generally separate to the lambdex one, rather than reusing all of the
code that happens to be common currently, because it'd make sense to
deprecate/remove the lambdex functionality and thus I feel it's best for
this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. 

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- add a warning about `files` being loaded into these packages, which
has been temporarily lost (#19027)
- adjust documentation #19067
- other improvements like #18195 and #18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first

This is a re-doing of #19022 with a simpler approach to deprecation, as
discussed in
#19074 (comment)
and
#19032 (comment).
The phasing will be:

| release | supports lambdex? | supports zip? | default layout | deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
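
The phasing in the table can be read as a small decision function. The sketch below is only an illustration of that table: `resolve_layout` and the warning texts are hypothetical, and raising an error for `lambdex` in 2.19 is an assumption about what "not supported" would mean in practice.

```python
from typing import Optional, Tuple

# Default layout per release, taken from the table.
DEFAULT_LAYOUT = {"2.17": "lambdex", "2.18": "zip", "2.19": "zip"}


def resolve_layout(release: str, configured: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (effective layout, deprecation warning or None) for a release.

    `configured` is the explicitly set layout, or None if the user left it implicit.
    """
    if release not in DEFAULT_LAYOUT:
        raise ValueError(f"release not covered by the table: {release}")
    layout = configured or DEFAULT_LAYOUT[release]
    warning = None
    if release == "2.17" and configured is None:
        # Implicit lambdex: ask users to choose explicitly.
        warning = 'set `layout` explicitly: "zip" is recommended, "lambdex" is allowed'
    elif release == "2.18" and configured == "lambdex":
        warning = 'remove `layout = "lambdex"` and switch to "zip"'
    elif release == "2.19" and layout == "lambdex":
        # Assumption: an unsupported layout is a hard error.
        raise ValueError("lambdex is no longer supported")
    return layout, warning
```

For example, `resolve_layout("2.18", None)` yields the `zip` layout with no warning, while `resolve_layout("2.18", "lambdex")` keeps `lambdex` but warns.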
WorkerPants pushed a commit that referenced this issue May 23, 2023
huonw added a commit to huonw/pants that referenced this issue May 23, 2023
…d#19076)

huonw added a commit that referenced this issue May 23, 2023
…ck of #19076) (#19120)

@huonw huonw self-assigned this Jun 4, 2023
huonw added a commit that referenced this issue Jun 13, 2023
…ilding (#19253)

This changes the behaviour of the `runtime` field of the
`python_aws_lambda_function`, `python_aws_lambda_layer` and
`python_google_cloud_function` FaaS targets to be translated into a
Pants-provided complete platform (if known), rather than using the less
specific `--platform` argument.

For example, when building a target like
`python_aws_lambda_function(..., runtime="python3.10")`, Pants will pull
out an appropriate complete platform resource and use that when building
the Lambda function, to choose the correct wheels.

This is motivated by:

1. the FaaS runtimes generally being stable (AIUI), so it's actually
reasonable/possible to provide this.
2. the naive wheel selection for `runtime`/`--platform` often fails and
users have to switch to using complete platforms themselves, and, one
can still use `complete_platforms` manually, so this is not _worse_.
3. the switch to `layout = "zip"` means that passing both
`runtime="..."` and `complete_platforms=[...]` will fail, as only one
`--platform=...` or `--complete-platform=...` argument can be passed to
the underlying PEX call.

The moving parts are:

1. a script `build-support/bin/generate_faas_complete_platforms.py` that
reads information about each FaaS system, including runtime docker repo
and the tags for each 'known runtime', and shells out to docker to extract
a complete platform JSON file
2. if `complete_platforms` isn't specified, convert the `runtime` to a
complete platform argument (if it's known), or continue falling back to
the naive `--platform` style
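
The precedence described in the second moving part could be sketched roughly as follows. This is not the Pants implementation: `pex_platform_args`, `naive_platform`, the `KNOWN_COMPLETE_PLATFORMS` table and its file name are all hypothetical, and only the `--complete-platform` and `--platform` argument names come from the text above.

```python
import re
from typing import List, Optional

# Hypothetical table of runtimes with a bundled complete platform file; the
# file name is made up for illustration.
KNOWN_COMPLETE_PLATFORMS = {"python3.10": "complete_platform_python3.10.json"}


def naive_platform(runtime: str) -> str:
    """Naive platform string for a runtime like "python3.9".

    ABI flags for Python < 3.8 are omitted for brevity.
    """
    match = re.fullmatch(r"python(\d)\.(\d+)", runtime)
    if match is None:
        raise ValueError(f"unrecognised runtime: {runtime!r}")
    major, minor = match.groups()
    return f"linux_x86_64-cp-{major}{minor}-cp{major}{minor}"


def pex_platform_args(
    runtime: Optional[str], complete_platforms: Optional[List[str]]
) -> List[str]:
    """Choose the platform arguments to pass to PEX."""
    if complete_platforms is not None:
        # Explicit complete platforms win and `runtime` is ignored. An empty
        # list yields no arguments, i.e. the ambient interpreter is used.
        return [f"--complete-platform={path}" for path in complete_platforms]
    if runtime in KNOWN_COMPLETE_PLATFORMS:
        return [f"--complete-platform={KNOWN_COMPLETE_PLATFORMS[runtime]}"]
    if runtime is not None:
        # Unknown runtime: fall back to the naive platform string.
        return [f"--platform={naive_platform(runtime)}"]
    return []
```

Keeping the fallback means runtimes without a bundled file behave as they did before.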

Notes: 

- Notable failing: this is only implemented for AWS: I cannot find an
equivalent docker repo to https://gallery.ecr.aws/lambda/python for GCF,
so despite being generic infrastructure, there's no 'known' runtimes for
GCF.
  - Does anyone know of docker images that simulate the runtime
    environment of GCF?
  - (I've found https://github.com/GoogleCloudPlatform/python-runtime but
    it's for AppEngine and is quite old, and
    https://cloud.google.com/functions/docs/concepts/python-runtime is very
    cagey about Python patch versions, let alone GLibc/manylinux tags!)
- This changes the interaction between the `runtime` and
`complete_platforms` field:
  - passing both is deprecated, and the behaviour changes too (`runtime`
    is now totally ignored, but, per point 3 of the motivation above, this
    would fail anyway)
  - passing `complete_platforms=[]` is now permitted, and, *I hope*, means
    the PEX calls will use the ambient interpreter, e.g. if running in a
    compatible docker image
- I haven't written explicit tests for the inference behaviour yet, as I
  didn't want to go too deep down that path before validating that this is
  sensible
- I've only implemented this for `layout = "zip"`, as `layout =
  "lambdex"` will be removed soon.
- (Once we have this, I think we can even do one step better: infer an
  appropriate runtime from interpreter constraints (e.g. if a repo has
  constraints that cover only one major version, like `==3.10.*`, infer
  `runtime="python3.10"`), so that a minimal FaaS target can be
  `python_aws_lambda_function(name="foo", handler="./foo.py:handler")`.)

Fixes #18195
@huonw huonw reopened this Jun 13, 2023
huonw commented Jun 13, 2023

This was done for AWS Lambda in #19253, but not GCF.

huonw added a commit that referenced this issue Jun 16, 2023
…9314)

This allows the `runtime` argument to `python_aws_lambda_function`,
`python_aws_lambda_layer` and `python_google_cloud_function` to be
inferred from the relevant interpreter constraints, when they cover only
one major/minor version. For instance, having `==3.9.*` will infer
`runtime="python3.9"` for AWS Lambda.

The inference is powered by checking for two patterns of interpreter
constraints that limit to a single major/minor version: equality `==3.9.*`
(implies 3.9) and range `>=3.10,<3.11` (implies 3.10). This inference
doesn't always work: when it doesn't work, the user gets an error
message to clarify by providing the `runtime` field explicitly. Failure
cases:

- if the interpreter constraints are too wide (for instance,
`>=3.7,<3.9` covering 2 versions, or `>=3.11` that'll eventually include
many versions), we can't be sure which is meant

- if the interpreter constraints limit the patch versions (for instance,
`==3.8.9` matching a specific version, or `==3.9.*,!=3.9.10` excluding
one), we can't be sure the cloud environment runs that version, so
inferring the runtime would be misleading

- if the interpreter constraints are non-obvious (for instance,
`>=3.7,<3.10,!=3.9.*` is technically 3.8 only), we don't try _too_ hard
to handle it. We can expand the inference if required in future.
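
A minimal sketch of the two-pattern inference described above, assuming a single constraint string as input. The function name and regular expressions are illustrative and are not taken from the Pants source.

```python
import re
from typing import Optional, Tuple

# Pattern 1: equality on a major.minor series, e.g. "==3.9.*".
_EQUALITY = re.compile(r"^==(\d+)\.(\d+)\.\*$")
# Pattern 2: a range covering exactly one minor version, e.g. ">=3.10,<3.11".
_RANGE = re.compile(r"^>=(\d+)\.(\d+),<(\d+)\.(\d+)$")


def infer_major_minor(constraint: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) if the constraint pins one minor version, else None."""
    compact = constraint.replace(" ", "")
    match = _EQUALITY.match(compact)
    if match:
        return int(match[1]), int(match[2])
    match = _RANGE.match(compact)
    if match:
        low = (int(match[1]), int(match[2]))
        high = (int(match[3]), int(match[4]))
        # Only accept ranges whose upper bound is the very next minor version.
        if high == (low[0], low[1] + 1):
            return low
    # Too wide, patch-specific, or non-obvious: the caller should ask for an
    # explicit `runtime` instead of guessing.
    return None


if __name__ == "__main__":
    for ic in ["==3.9.*", ">=3.10,<3.11", ">=3.7,<3.9", "==3.8.9"]:
        print(ic, "->", infer_major_minor(ic))
```

Returning `None` for anything outside the two patterns matches the conservative approach described in the failure cases: an error asking for an explicit `runtime` is preferred over a guess.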

For instance, if one has set `[python].interpreter_constraints =
["==3.9.*"]` in `pants.toml`, one can build a lambda artefact like (and
similarly for a GCF artifact):

```python
python_sources()
python_aws_lambda_function(name="func", handler="./foo.py:handler")
```

This is the final piece* of my work to improve the FaaS backends in
Pants 2.18:

- using the simpler "zip" layout as recommended by AWS and GCF,
deprecating Lambdex (#18879)
- support for AWS Lambda layers (#18880)
- Pants-provided complete platforms JSON files* when specifying a known
`runtime` (#18195)
- this PR, inferring the `runtime` from ICs, when unambiguous (including
using the new Pants-provided complete platform when available) (#19304)

(* The fixed complete platform files are currently only provided for AWS
Lambda, not GCF. #18195.)

The commits are individually reviewable.

Fixes #19304
huonw commented Aug 3, 2023

This was done for AWS by running an appropriate pex3 ... command (see #19253) in the docker images AWS provides. It seems like GCF doesn't provide such images, so it wasn't easy to do that too.

The current thinking is to run the pex3 ... command in the real environment, e.g. deploy a function that calls subprocess.run. I'm personally unsure how to do this yet (I've never used GCP!), so haven't invested the time.

ryaminal commented Aug 31, 2023

decided to run this on the 4 python versions that are available in GCF v2 (based on top of cloud run). not certain if it's different for v1, but i can get those also.
gcf_v2_38
gcf_v2_39
gcf_v2_310
gcf_V2_311

no idea if all 4 were needed but... yeah

huonw commented Sep 1, 2023

That's great @ryaminal! More is better! I'm on leave for a few days so I can't offer much assistance just now, but if you were to make a PR adding them to the repo, that'd be cool.

For v1 vs v2, I have no idea. Do you have a sense for how often people use v1?

@huonw huonw removed their assignment Sep 18, 2023
@benjyw benjyw assigned benjyw and unassigned benjyw Oct 4, 2023
@huonw huonw changed the title Include appropriate complete_platforms for serverless environments runtime Include appropriate complete_platforms for serverless/FaaS environments runtime Apr 25, 2024
huonw pushed a commit that referenced this issue Aug 5, 2024
…on runtime (#21248)

Closes #18195, closes #20515.

This adds `complete_platforms` for all Google Cloud Functions runtimes
and allows automatic selection of those platforms based on the provided
runtime. Additionally, the `generate_faas_complete_platforms.py` has
been updated to support generating `complete_platforms` JSON files for
GCF.

One complication while building this is that Google [publishes the
runtime image for each version of Python to a separate Docker
repository](https://cloud.google.com/functions/docs/concepts/execution-environment#python),
so the `known_runtime_docker_repo` abstraction that was used before
doesn't work for this case. The way I solved this is by pushing the
docker repo field down into `PythonFaaSKnownRuntime` instead, and
letting this field be customizable per-runtime.

Additionally, the tags used by GCF look like this:
`python37_20240728_3_7_17_RC00`. This follows the template of
`{runtime}_{date}_{python_version}_{signifier}`. This unfortunately
makes it very hard to find the "latest" version of the GCF image for a
given runtime, so I had to go into Google Container Registry to find the
latest tagged versions for each Python version.
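
To illustrate the tag template, here is a small sketch that parses a tag and picks the newest one for a runtime from a list. It assumes the date component is `YYYYMMDD` (so string order matches chronological order), which the example tag suggests but the text does not confirm. The helper names are hypothetical, and any tags other than `python37_20240728_3_7_17_RC00` used with it would be made-up test inputs.

```python
import re
from typing import Iterable, NamedTuple, Optional


class GcfTag(NamedTuple):
    runtime: str
    date: str
    python_version: str
    signifier: str


# {runtime}_{date}_{python_version}_{signifier}
_TAG = re.compile(
    r"^(?P<runtime>python\d+)_(?P<date>\d{8})_(?P<version>\d+_\d+_\d+)_(?P<signifier>\w+)$"
)


def parse_tag(tag: str) -> Optional[GcfTag]:
    """Split a tag into its four components, or return None if it doesn't fit."""
    match = _TAG.match(tag)
    if match is None:
        return None
    version = match["version"].replace("_", ".")
    return GcfTag(match["runtime"], match["date"], version, match["signifier"])


def latest_tag(tags: Iterable[str], runtime: str) -> Optional[str]:
    """Return the tag with the greatest date for the given runtime, if any."""
    best: Optional[str] = None
    best_date = ""
    for tag in tags:
        parsed = parse_tag(tag)
        if parsed is not None and parsed.runtime == runtime and parsed.date > best_date:
            best, best_date = tag, parsed.date
    return best


if __name__ == "__main__":
    print(parse_tag("python37_20240728_3_7_17_RC00"))
```

Something along these lines could replace the manual lookup if a tag listing were available, though obtaining that listing is outside this sketch.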