Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Controlling what gets uploaded to Lambda zip file #35

Closed
lukehoban opened this issue Sep 28, 2017 · 18 comments
Closed

Controlling what gets uploaded to Lambda zip file #35

lukehoban opened this issue Sep 28, 2017 · 18 comments
Assignees
Labels
impact/performance Something is slower than expected impact/reliability Something that feels unreliable or flaky
Milestone

Comments

@lukehoban
Copy link
Member

With #34 we now upload the entire source folder as part of the Lambda zip. Previously, we uploaded the entire node_modules folder.

It is in general necessary to have the full source folder to support arbitrary require('./foo') in code running "on the inside".

However, for many/most Lambdas, they will not use any of the files in the root folder or even the node_modules on the inside. So we are carrying these along even though they won't be used.

Additionally, in local workflows, the working directory may contain source files or sensitive data that should not be uploaded into the runtime environment.

We could solve this with a configuration setting (possibly in Pulumi.yaml) along the lines of https://serverless.com/framework/docs/providers/aws/guide/packaging/.

@joeduffy joeduffy added this to the 0.8 milestone Oct 10, 2017
@CyrusNajmabadi
Copy link
Contributor

Do we think it's ok to include everything by default, and then have user exclude things they don't want uploaded? Or should all inclusion be explicit?

@joeduffy
Copy link
Member

@lukehoban
Copy link
Member Author

Related to those, but technically different. There’s two phases:

  1. What gets uploaded from dev machine to deployment service
  2. What gets uploaded from deployment service to Lambda/container

This is about the second. It may be we decide to offer one mechanism shared across both, but we may not.

I’d be inclined to upload everything by default, and let folks remove. But we also should look at Serverless and others for inspiration since they’ve gone through this process already.

@lukehoban lukehoban modified the milestones: 0.8, 0.9 Oct 18, 2017
@lukehoban
Copy link
Member Author

lukehoban commented Oct 26, 2017

One additional related issue is native modules. When development is done on Mac/Windows, these will not be valid ELF binaries and although they will be uploaded to the Lambda, they will not work.

This is a major known issue with Lambda, and we don't exactly make it worse, but we don't help (and we abstract away that it's Lambda, so it's not as easy for us to make the same excuses as Lambda does).

There are a few options:

  1. If our PPC took only source and ran "npm install" itself - then at least when using a PPC this would work (by us taking on the burden of being the CI)
  2. For local development, instead of directly uploading the folder, we could copy it (without node_modules), and then run npm install inside a Docker container. This would most likely introduce as many or more problems than it solves, but is probably the only way to get this working locally. See https://medium.com/tomincode/serverless-and-npm-native-modules-85cd9a648458 for example.

@ericrudder
Copy link
Member

ericrudder commented Oct 26, 2017 via email

@lindydonna
Copy link

@ericrudder We could start a wiki with "known issues" and link to GitHub issues. Later, when we want to release, we can see which issues are still open.

@mmdriley
Copy link
Contributor

mmdriley commented Nov 2, 2017

This has been a huge pain point in the Learning Machine bringup because any change to the source tree invalidates all Lambda resources in the project.

@joeduffy
Copy link
Member

joeduffy commented Nov 3, 2017

I second @mmdriley's feedback. This adds about a minute of overhead to each LM publish, even when the lambdas weren't essential and certainly didn't change. Sadly, I have few suggestions on what to do here, because it's JavaScript and could require just about anything, other than requiring a developer to explicitly list what to include (or at least giving them the option to prune).

It'd be good to look at Up and Apex to see if they have any clever solution for this.

@lindydonna
Copy link

I like Luke's suggestion of having an exclude/include file similar to Serverless Framework: https://serverless.com/framework/docs/providers/aws/guide/packaging/

@lukehoban
Copy link
Member Author

Up uses an .upignore: https://github.com/apex/up/blob/13108baf78dfe7238794e60dda5ea586ee9e672f/docs/04-configuration.md#ignoring-files and https://github.com/apex/up/blob/master/internal/zip/zip.go#L40-L45

So pretty similar to Serverless Framework.

I'm pretty convinced that is the right answer here too.

However, we have a challenge that there are two steps, both of which will eventually want a .pulumiignore:

  1. User machine -> PPC
  2. PPC -> Lambda

The use cases we're discussing are about (2).

I suggest we implement a .pulumiignore now to address (2). We can hold off longer on addressing (1) for pulumi/pulumi-service#122 and pulumi/pulumi-service#22 - and will presumably need to choose a different name for that.

@lindydonna
Copy link

Do we currently do any kind of webpack/browserify step before uploading the code?

@lukehoban
Copy link
Member Author

Do we currently do any kind of webpack/browserify step before uploading the code?

No. Those almost always change semantics of require application-specific knowledge/configuration.

[from @joeduffy in private chat] I can imagine wanting to customize per lambda in some cases. For instance, perhaps I have one lambda that does indeed need the full sources in my dir, and 100 lambdas that do not (and so I want to avoid superfluously updating them). It's possible we tell people to just stick them into different subdirs.

I can imagine that too - but I think the 90% case of the pain here is assets which have no business going to Lambda at all. I think we should fix that. Neither Serverless Framework nor Up have yet addressed the customer-per-lambda case either.

Notably - the soltuion to per-lambda customization is probably going to be the same here as for other reasons we might want to customize it: pulumi/pulumi-cloud#168

I think in a model where we allow an explicit cloud.Function(() => {}, { /* args */}) model, we can have one of the args be the specific files to include/exclude.

@lindydonna
Copy link

Given that we're already transforming the user's code, why not have an option to do a webpack? In Azure Functions, there's a preview tool that does this and it worked seamlessly for most customers. It had a big improvement on function size.

Agreed about the per-lambda customization case, which we can punt for now.

@lukehoban
Copy link
Member Author

Given that we're already transforming the user's code, why not have an option to do a webpack?

Perhaps - depends on whether it's something we could reliably do on the users behalf. Do you have pointers on this? Or suggestions on how we could add something similar to our workflow (which is not quite 1:1 with direct Azure Functions user workflow).

@lindydonna
Copy link

lindydonna commented Nov 4, 2017

@lukehoban There's probably a lot more complexity here that I don't entirely understand. :) I assumed we could do something similar to this Serverless Framework plugin: https://www.npmjs.com/package/serverless-webpack. It looks like there's some customization involved, so maybe that's a solution.

@joeduffy
Copy link
Member

joeduffy commented Nov 4, 2017

/cc @ellismg

There's definitely a deeper topic brewing here, which is whether we "build" code on behalf of the user and, if so, what extensibility mechanisms we offer. As we make progress on the service, I think we're going to have to do increasingly more of this, as @ellismg and I have discussed before.

In fact, our current cloud.Service Docker builds are arguably a perfect example of this already in action.

I do like, however, that architecturally these capabilities are layered distinctly. On one hand, the Docker builds bug me because they are the sole instance where we've violated this. On the other hand, I love it, because of its beautiful simplicity.

serverless-webpack is definitely very interesting to dig into. Interestingly, one of the first questions from the guy I spoke to today was whether we support webpack for lambdas.

@lindydonna
Copy link

FYI, it looks like Serverless Framework does have a feature for packing Lambdas individually, via the individually setting: https://serverless.com/framework/docs/providers/aws/guide/packaging#packaging-functions-separately

The serverless-webpack plugin recommends using that setting when using webpack (for obvious reasons).

@joeduffy
Copy link
Member

Also /cc'ing @ellismg , since this ties in closely with https://github.com/pulumi/pulumi-service/issues/22

@joeduffy joeduffy modified the milestones: 0.9, 0.10 Dec 1, 2017
@joeduffy joeduffy added impact/performance Something is slower than expected impact/reliability Something that feels unreliable or flaky labels Dec 5, 2017
@ellismg ellismg modified the milestones: 0.10, 0.11 Jan 10, 2018
@ellismg ellismg modified the milestones: 0.11, 0.12 Feb 11, 2018
@lukehoban lukehoban assigned lukehoban and unassigned CyrusNajmabadi Apr 2, 2018
@lukehoban lukehoban modified the milestones: 0.12, 0.14 Apr 20, 2018
lukehoban added a commit that referenced this issue May 13, 2018
An `aws.serverless.Function` can now accept a list of include paths, which will be included inside the Lambda package along with the generated `__index.js` file

Also changes the default from '.' to './node_modules', so that the very common case of changes to the root folder do not trigger changes to the code for lambdas.  If files from the local folder are needed by the lambda, they can be included in the `includePaths` parameter.

Also adds support for providing a pre-created Role instead of just Policies, to support scenarios where users want to pre-create or share a Role across multiple functions.

Fixes #35.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
impact/performance Something is slower than expected impact/reliability Something that feels unreliable or flaky
Projects
None yet
Development

No branches or pull requests

7 participants