Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Fix test-install by moving tests into module #1232

Merged
merged 96 commits into from
May 25, 2018

Conversation

nelson-liu
Copy link
Contributor

@nelson-liu nelson-liu commented May 17, 2018

Goals of this PR:

  1. 1. Prevent tests from being installed as a module in python setup.py . Currently, pip copies them to site-packages, which is definitely not intended behavior.
  2. 2. installation method-invariant way to load and run tests, especially those that depend on files outside of the module (e.g. scripts or tutorials). The issue is that files can be in different places depending on if allennlp is installed with pip or setup.py install.

For 2: an easy solution might be to, when running the test-install command, simply download and cache a copy of the allennlp source for the appropriate version at ~/.allennlp . Note that this wouldn't work for non-release versions, but I think that's a good thing --- if you're working on master or developing features, you shouldn't be using test-install anyway.

Thoughts? I'm not sure how i feel about the above solution, so feedback would be great before i write the code.


Edit: another solution is to just move the tests into the allennlp module (this is a fairly standard way of structuring projects anyway). The downside to this is that it still doesn't enable using allennlp test-install to run tests in tutorials, as well as the tests that depend on files outside the module (e.g. SRL test that compares output of evaluator with a perl script). But maybe we shouldn't be running those tests anyway in test-install? or we can move the perl script into the tests and then just not run the notebook tests, since we'll be rewriting them anyway.

This solution is rather time-intensive, though, since it involves fixing all of the paths that are hardcoded into the tests. Should not be too hard to write a script to go through and change the paths, minimizing human labor to just a check up...

edit: fixes #1228

@nelson-liu
Copy link
Contributor Author

nelson-liu commented May 17, 2018

Another thing we could do is move the tests within the allennlp module...I know of at least a few others packages that do this (e.g., scikit-learn and pandas. numpy and scipy also put tests within each module). Maybe this solution is a bit drastic? I don't think too much depends on where the tests are actually located (maybe some of the CI scripts? But py.test in the root should continue to work)

nevermind this still doesn't solve the fundamental issue of having no way to reliably access ./scripts and ./tutorials in the tests unless we download the whole source. But do we even want to do that? 🤔

@nelson-liu nelson-liu changed the title [WIP] reworking test-install command [WIP] Fix test-install by moving tests into module May 18, 2018
@nelson-liu
Copy link
Contributor Author

nelson-liu commented May 18, 2018

I went ahead and made the change, was a lot less work than I anticipated. even if we don't want to move tests within the module, a lot of these commits can be cherry-picked for an eventual windows-compatibility PR.

maybe best to look at this after EMNLP...The diff looks like there are a lot of changes, but 1.9K of the added lines are from adding srl-eval.pl to the allennlp/tools directory.

@nelson-liu nelson-liu changed the title [WIP] Fix test-install by moving tests into module Fix test-install by moving tests into module May 18, 2018
@schmmd
Copy link
Member

schmmd commented May 21, 2018

@nelson-liu what's the difference between tools and scripts?

@nelson-liu
Copy link
Contributor Author

nelson-liu commented May 21, 2018

  1. scripts are standalone, one-off entities, while tools are core to some other component of the package (e.g. evalb is essential to the constituency parser metric, and the srl perl eval script is used in the tests). In general, if some outside tool/script is necessary to make a component of the library function properly, it should go into tools.

  2. since scripts are standalone, they go outside of the package. Tools are explicitly called by things inside the package, so they should go inside the package as well --- this lets the package access them after being pip-installed, which seems important.

I named them differently to make it easier to distinguish. If you have better names for them, I'd be happy to change! I agree that it can be confusing...

@nelson-liu
Copy link
Contributor Author

(let me know if there's anything I can do to make this easier to review, as well).

@schmmd
Copy link
Member

schmmd commented May 22, 2018

@nelson-liu I think the names are fine. A brief (1 sentence) README.md in each folder would be great, defining the role of each. Overall, this seems like a lot of improvements, but also a lot of changes. As it touches the whole codebase I'm not sure if I'm the best person to code review it.

@@ -60,7 +61,7 @@ def filename_to_url(filename: str, cache_dir: str = None) -> Tuple[str, str]:

return url, etag

def cached_path(url_or_filename: str, cache_dir: str = None) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joelgrus does this Union, and all the str(path) calls in this PR change how you feel about Path?

@DeNeutoy
Copy link
Contributor

I'm still unsure whether it's desirable to have users run tests. Can you point me to some other python projects which do this?

@nelson-liu
Copy link
Contributor Author

nelson-liu commented May 23, 2018

NumPy / SciPy lets users run their tests through the CLI (https://stackoverflow.com/a/9200923/2544124):

import numpy
numpy.test()

(and accordingly for scipy)

Also theano

@schmmd
Copy link
Member

schmmd commented May 24, 2018

For another motivating use case, I'd like to create a Docker image that downstream users can use to run Beaker. It would look like @cristipp 's and would install AllenNLP via pip. Having the ability to test the pip distribution would be invaluable for making sure the image works as expected.

from allennlp.commands.test_install import _get_module_root


class TestTestInstall(AllenNlpTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation for this test? A brief comment would be helpful. I'm confused because it's a test, in the tests folder, that's checking to see whether the tests folder exists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is a rename rather than a new file. While that doesn't change my confusion, you don't need to address it as part of this PR.

Copy link
Contributor Author

@nelson-liu nelson-liu May 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is quite confusing. The context is that when a user runs test-install, we have no idea where they're running it from, so we do an os.chdir to the module root in order to get all the paths in the fixtures to resolve properly.

The logic within test-install is pretty hard to test in its entirety (see discussion here), so this test is basically a test for that os.chdir component, checking that we correctly find the path to chdir to.

does that help? I'll add this as a comment too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that helps!

Copy link
Member

@schmmd schmmd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great--just a couple of comments.

Also, srl-eval.pl is now at allennlp/tools/srl-eval.pl and scripts/srl-eval.pl. We should probably just move the latter.

@nelson-liu in the future I think this would have been reviewed more promptly if you had broken it up into smaller PRs. You accomplish a few things in this PR:

  1. Moving tests to allennlp/tests.
  2. Improving the cleanup in wikitables tests.
  3. Changing how we write paths in the tests.

@@ -29,7 +31,8 @@ def test_accuracy_is_scored_correctly(self):
logical_form = ('((reverse fb:row.row.year) (fb:row.row.index (max '
'((reverse fb:row.row.index) (fb:row.row.league fb:cell.usl_a_league)))))')

wikitables_accuracy = WikiTablesAccuracy(table_directory='tests/fixtures/data/wikitables/')
print(os.getcwd())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nelson-liu is this an intentional print?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not, thanks for catching that!

[mypy]

[mypy-allennlp.tests*]
ignore_errors = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? Did mypy previously not run on our tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, mypy was previously not being run on the testing code --- I thought this was intentional?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you made the right change in this PR then. I'm not sure what we want long term.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nelson-liu, do the mypy errors on the tests look helpful? I'd guess that we don't care about them, because it's not as important to have clear type annotations in the tests, but if they look useful we could reconsider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matt-gardner here are the mypy errors: https://gist.github.com/nelson-liu/f94a96265973a390066f45399718fe5e . I guess they're not entirely useless? Perhaps we should address in another PR though, this one is already huge.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, ok, I'll open that as an issue. Thanks for doing all of this!

from allennlp.commands.test_install import _get_module_root


class TestTestInstall(AllenNlpTestCase):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is a rename rather than a new file. While that doesn't change my confusion, you don't need to address it as part of this PR.

@schmmd schmmd self-assigned this May 25, 2018
@nelson-liu
Copy link
Contributor Author

Thanks for taking a look, I agree that this PR is difficult to review. The main issue is that (2) and (3) are actually not problems unless (1) is done, but in order for (1) to work properly at all (2) and (3) have to be achieved.

I basically just moved the tests, ran them, and started fixing the errors.

@schmmd
Copy link
Member

schmmd commented May 25, 2018

@nelson-liu sounds great. If you can move srl-eval.pl instead of creating a new copy, then I'll approve and merge.

@nelson-liu
Copy link
Contributor Author

nelson-liu commented May 25, 2018

before you merge, i'd like to run some last sanity-check tests on the code in this branch (building it from github source with pip on a clean environment and seeing if everything works as expected). I've tested it locally, and it seems to work, but i want to verify with a GPU as well. I'll let you know when that's done.

Maybe this is something you might want to add to CI as well? Not sure.

@DeNeutoy
Copy link
Contributor

I was semi-serious about not liking the Path thing - I think it will just cause us to have loads of str(path) things everywhere and doesn't really add much. I'd kinda prefer it if this PR switched back to using string paths.... @joelgrus lets argue

@DeNeutoy
Copy link
Contributor

Meh maybe i'm just being grumpy and un-modern - I didn't realise it might fix some of the windows compatability issues. If that's the case let's go for it.

@matt-gardner
Copy link
Contributor

@DeNeutoy, I agree that the only benefit I see to using the Path stuff is better windows compatibility. Otherwise just using a string path is simpler and easier to read.

@nelson-liu
Copy link
Contributor Author

I'll merge this when CI passes, unless there's anything further to be discussed?

I pip installed allennlp from this branch on an osx machine and a linux gpu machine, and allennlp test-install passed on both hosts.

@joelgrus
Copy link
Contributor

to @DeNeutoy 's point, calling str() on pathlib.Path objects is a bad code smell. if the places we need those paths can't accept pathlib.Path object then I would rather just use strings.

(my ideal solution would be that those places accept Path objects tho)

@nelson-liu nelson-liu merged commit 59a0e5a into allenai:master May 25, 2018
@nelson-liu
Copy link
Contributor Author

thanks all for taking a look

@schmmd schmmd mentioned this pull request May 31, 2018
gabrielStanovsky pushed a commit to gabrielStanovsky/allennlp that referenced this pull request Sep 7, 2018
fixes allenai#1228

Goals of this PR:
1. [x] 1. Prevent tests from being installed as a module in `python setup.py` . Currently, pip copies them to `site-packages`, which is definitely not intended behavior.

2. [x] 2. installation method-invariant way to load and run tests, especially those that depend on files outside of the module (e.g. `scripts` or `tutorials`). The issue is that files can be in different places depending on if allennlp is installed with `pip` or `setup.py install`. This is done by moving the tests within the module and using os.chdir to make the paths in the fixtures properly resolve.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

running allennlp test-install after installing with pip leads to many errors
5 participants