Build executables in stack_snapshot and use them as tools #1306

aherrmann · 2020-04-20T14:44:21Z

Is your feature request related to a problem? Please describe.

Some of the packages defined by a call to stack_snapshot depend on executables (tools) that themselves are Hackage packages. Examples include libraries like language-c which depends on alex and happy, or grpc-haskell-core which depends on c2hs. Furthermore, some packages provide executables that the user of stack_snapshot will likely wish to invoke directly, e.g. doctest or ghcide.

Since #907 stack_snapshot provides the tools attribute which allows to pass executables that are built outside of stack_snapshot in to be available as build tools for the packages defined by stack_snapshot. This approach is very limited:

Every package depends on the same set of tools, which means that changing any tools or adding additional tools requires a full rebuild of all packages.
This does not scale well for tools that appear in the middle of the dependency graph. c2hs is such an example, it itself depends on Hackage packages like language-c, which in turn depends on alex and happy. I.e. to build a package that depends on c2hs, say haskell-grpc-core we need to provide alex and happy, then a stack_snapshot for c2hs dependencies, then build c2hs itself, and then finally a stack_snapshot for haskell-grpc-core and other packages. I.e. in a typical use-case we end up with external workspaces @alex, @happy, @c2hs_deps, @c2hs, and @stackage.
This is very inconvenient for packages with many dependencies that provide executables. E.g. ghcide has many dependencies and provides a library and an executable. If we wish to build it with stack_snapshot we need to define a stack_snapshot exporting ghcide the library and all dependencies of ghcide the binary, and then define a separate external workspace for the ghcide binary where we import all these dependencies. This incurs a lot of boilerplate and we have to be careful to keep the versions/revisions in sync between stack_snapshot and the http_archive used for the haskell_cabal_binary call. Alternatively, we could build both the library and binary in one http_archive but then we have to manually perform the dependency resolution for ghcide.

Describe the solution you'd like

I'd like stack_snapshot to build both library and executable components of packages and integrate them into the dependency graph as appropriate. I.e. we shouldn't have to provide alex, happy, c2hs, ... to the tools attribute of stack_snapshot. And we shouldn't have to resort to intermediate stack_snapshot invocations as required for tools like c2hs.

As of now stack does not provide us any information about the components provided by a package. And we would prefer to avoid implementing a Cabal file parser in Starlark. For that reason I'd propose to have users define the components of the packages they expect to be built, with a predefined set for common packages like alex, happy, .... In the end this would look something like this

stack_snapshot(
    name = "stackage",
    packages = [
        "haskell-grpc-core",
        "ghcide",
    ]
    components = {
        "ghcide": ["lib", "exe"],  # Where `exe` is short for `exe:<package_name>`
        # The following are common and included in `rules_haskell`
        # "alex": ["exe"],
        # "happy": ["exe"],
        # "c2hs": ["exe"],
    },
)

Describe alternatives you've considered

Hazel used to parse Cabal files to automatically determine which components are available. However, this introduces a fair amount more complexity as seems necessary that we would prefer to avoid. In any case this only addresses the question of whether the components attribute should be user defined or inferred.
stack ls dependencies json could possibly be patched to include information on package components. However, it is unclear how much effort that would require. Also, parsing JSON into Starlark is not without issues either. As above this only addresses whether components should be user defined or inferred.

Additional context

The issue was discussed briefly in Add a tools argument to Stack and Cabal rules #907
stack_snapshot use executable components in tools attribute #1305, stack_snapshot build executable components #1304, haskell_cabal_binary allow target name and executable to differ #1303 provides a possible implementation.
A discussion around the naming of the generated library and executable targets was started in haskell_cabal_binary allow target name and executable to differ #1303 and will be continued here.

The text was updated successfully, but these errors were encountered:

aherrmann · 2020-04-20T14:55:58Z

Addressing the comment in #1303 (comment)

Arguably binaries should go in a separate namespace, because their names could clash with a library name. I'd have,
@stackage//:doctest
@stackage-bin//:doctest

That would work around name clashes, however, I'm not sure it's worth the complexity. With libraries and binaries in separate workspaces we would have to fetch the Cabal sources twice, which seems wasteful, and would have to generate criss-cross dependencies between these two workspaces for cases like c2hs.

Which is easier to remember than any mangling of the binary name. Now, you could ask, what happens if two packages export a binary with the same name? That's a situation Cabal doesn't handle well, but we can do better. For some reason that I don't remember, we chose to have @stackage//:doctest instead of @stackage//doctest:doctest (I think this was discussed in the original ticket). However, if we switched to the latter, then we'd have a reasonable scheme for both binaries and packages that export multiple libraries (which Cabal now allows). We'd have e.g.
@stackage//doctest:doctest
@stackage//doctest/exe:doctest

Sure, that would be an option. It's worth pointing out that some of these targets would have to be aliases, if we don't want to duplicate the package sources multiple times. Currently stack_snapshot fetches the doctest sources into a directory doctest-0.16.3. This is fine, because at the moment the corresponding haskell_cabal_library is defined at the top-level and can take srcs = glob(["doctest-0.16.3/**"]). The package hierarchy matters because Cabal has to build in the directory containing the .cabal file. Just as one possible approach, with the layout proposed above we could put the sources into doctest/src, then define all targets in the doctest package, i.e. @stackage//doctest:doctest and @stackage//doctest:doctest_exe_doctest, but then define friendlier exe aliases @stackage//doctest/exe:doctest.

Doesn't look so nice. But how about,

@stackage//doctest:doctest
@stackage//doctest:another-public-library
@stackage-bin//doctest:doctest

As mentioned above, I'm not convinced that separating binaries and libraries into different workspaces doesn't introduce more problems than it solves.

mboes · 2020-05-27T06:42:25Z

That would work around name clashes, however, I'm not sure it's worth the complexity. With libraries and binaries in separate workspaces we would have to fetch the Cabal sources twice, which seems wasteful,

That's not the only way to do it. Assuming we have a way to determine which packages have executables (this is a problem either way, and contingent on commercialhaskell/stack#5275), then e.g. stackage-bin//:alex or stackage-bin//:hspec-discover can be aliases to internal targets not exposed to the user. These internal targets can have whatever name we like, and it no longer matters how horrible they look (like stackage//hspec-discover/exe:hspec-discover). The targets can even have names mangled so badly as to be unrecognizable. The user only needs to know the alias.

and would have to generate criss-cross dependencies between these two workspaces for cases like c2hs.

This is again not necessarily the case I believe, because stackage-bin//... can be aliases. But even if it were the case, this should be fine for Bazel. Repositories are only a way to partition source code, but the dependency graph is allowed to cut across all repositories arbitrarily.

Just as one possible approach, with the layout proposed above we could put the sources into doctest/src, then define all targets in the doctest package, i.e. @stackage//doctest:doctest and @stackage//doctest:doctest_exe_doctest, but then define friendlier exe aliases @stackage//doctest/exe:doctest.

... or exe aliases elsewhere. That looks good to me: mangled target names internally if that's what's necessary for technical reasons, and aliases to give short, memorable names to the user.

As mentioned above, I'm not convinced that separating binaries and libraries into different workspaces doesn't introduce more problems than it solves.

Fair enough. I think the problems that it introduces are essentially vacuous, because AFAICS these can be aliases. And the interface to the user looks quite a bit nicer to me. No mangling, which requires to learn and to remember the mangling scheme, and no unholy mix of / and : that are all too easy to swap around. You want a binary? Sure! In the common case it has a simple name and unambiguous name: @stackage-bin//hspec-discover. Only binaries in non-eponymous packages have longer names.

aherrmann · 2020-05-29T15:44:34Z

That's not the only way to do it. Assuming we have a way to determine which packages have executables (this is a problem either way, and contingent on commercialhaskell/stack#5275), then e.g. stackage-bin//:alex or stackage-bin//:hspec-discover can be aliases to internal targets not exposed to the user.

Yes, with aliases we are free to expose libraries and binaries to the user under any name.

Just as one possible approach, with the layout proposed above we could put the sources into doctest/src, then define all targets in the doctest package, i.e. @stackage//doctest:doctest and @stackage//doctest:doctest_exe_doctest, but then define friendlier exe aliases @stackage//doctest/exe:doctest.

... or exe aliases elsewhere. That looks good to me: mangled target names internally if that's what's necessary for technical reasons, and aliases to give short, memorable names to the user.

Great, #1303 is in line with that approach.

As mentioned above, I'm not convinced that separating binaries and libraries into different workspaces doesn't introduce more problems than it solves.

Fair enough. I think the problems that it introduces are essentially vacuous, because AFAICS these can be aliases. And the interface to the user looks quite a bit nicer to me. No mangling, which requires to learn and to remember the mangling scheme, and no unholy mix of / and : that are all too easy to swap around. You want a binary? Sure! In the common case it has a simple name and unambiguous name: @stackage-bin//hspec-discover. Only binaries in non-eponymous packages have longer names.

Sounds good to me.

On a side-note, aliases are not without cost either. In particular, bazel query interacts poorly with aliases as it considers aliases targets of their own and last I checked there's no flag to make it resolve aliases during query. For example the vendored_packages aliases that we produce in stack_snapshot break hrepl. To avoid breaking hrepl more we should try to keep the haskell_cabal_library targets in their current place and only use aliases for executables.

aherrmann added the type: feature request label Apr 20, 2020

This was referenced Apr 20, 2020

haskell_cabal_binary allow target name and executable to differ #1303

Merged

Feature Request: Expose Cabal components in ls dependencies json commercialhaskell/stack#5275

Open

This was referenced Jun 5, 2020

stack_snapshot build executable components #1304

Merged

stack_snapshot use executable components in tools attribute #1305

Merged

mergify bot closed this as completed in #1305 Jun 10, 2020

aherrmann mentioned this issue Jun 10, 2020

stackage_snapshot: generate haskell_binary targets for requested binaries #1006

Closed

aherrmann mentioned this issue Aug 23, 2021

Automatic visibility of stack_snapshot dependencies to vendored packages #1583

Merged

ylecornec mentioned this issue Nov 24, 2021

sublibraries in stack_snapshot rule #1636

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build executables in stack_snapshot and use them as tools #1306

Build executables in stack_snapshot and use them as tools #1306

aherrmann commented Apr 20, 2020 •

edited

Loading

aherrmann commented Apr 20, 2020

mboes commented May 27, 2020

aherrmann commented May 29, 2020

Build executables in stack_snapshot and use them as tools #1306

Build executables in stack_snapshot and use them as tools #1306

Comments

aherrmann commented Apr 20, 2020 • edited Loading

aherrmann commented Apr 20, 2020

mboes commented May 27, 2020

aherrmann commented May 29, 2020

aherrmann commented Apr 20, 2020 •

edited

Loading