Generating geojson files on run command #184

swaradgat19 · 2023-08-14T23:50:46Z

fixes #181

…ion using tqdm

swaradgat19 · 2023-08-15T00:28:06Z

I'm currently updating the tests. Will push them as and when I resolve them

swaradgat19 · 2023-08-15T14:37:46Z

@kaczmarj Since we've updated the result directories (model-outputs-csv/geojson), we will have to update the tests so that they assert that the csv and json files are stored in the tmp_path/model-outputs-csv and tmp_path/model-outputs-geojson directories, right? Just trying to get an intuition so that I can modify the tests accordingly.

swaradgat19 · 2023-08-15T15:02:03Z

Updated the tests. All are passing except one. In the test test_issue_97, when we are running the command again using runner.invoke, it fails because the output directory already exists (for geojson). Perhaps we can let it generate the resulting geojson directory again? Or should I handle it in the test itself?

kaczmarj · 2023-08-15T15:12:12Z

> Since we've updated the result directories (model-outputs-csv/geojson), we will have to update the tests so that they assert that the csv and json files are stored in the tmp_path/model-outputs-csv and tmp_path/model-outputs-geojson directories, right?

Yes that is correct.
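
A rough sketch of what the updated assertions might look like. The helper name `assert_outputs_exist` is made up for illustration, and the `.json` extension for the GeoJSON files is an assumption based on the "csv and json files" wording above; only the two directory names come from this thread.

```python
from __future__ import annotations

from pathlib import Path


def assert_outputs_exist(results_dir: Path, slide_ids: list[str]) -> None:
    """Assert that every slide has a CSV and a GeoJSON in the new directories."""
    csv_dir = results_dir / "model-outputs-csv"
    geojson_dir = results_dir / "model-outputs-geojson"
    for slide_id in slide_ids:
        # One output file per slide in each directory (hypothetical naming).
        assert (csv_dir / f"{slide_id}.csv").exists()
        assert (geojson_dir / f"{slide_id}.json").exists()
```

Inside a pytest test this would be called as `assert_outputs_exist(tmp_path, ["some-slide-id"])` after `runner.invoke` returns.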

> it fails because the output directory already exists (for geojson).

i don't see this error in the github actions logs. what is the traceback?

swaradgat19 · 2023-08-15T15:17:33Z

It was raised because we check whether the output directory already exists. If it did, we raised FileExistsError (instead of a click exception, I believe).

import shutil
from pathlib import Path

def parallelize_geojson(csvs: list, results_dir: Path, num_workers: int) -> None:
    output = results_dir / "model-outputs-geojson"

    if not results_dir.exists():
        # A missing directory is a "not found" condition, not "already exists".
        raise FileNotFoundError(f"results_dir does not exist: {results_dir}")
    if output.exists():
        # raise FileExistsError("Output directory already exists.")
        # Temporary workaround so the test passes: delete the old output.
        shutil.rmtree(output)
    # rest of the code

To handle that, I'm deleting the directory if it already exists (using shutil), and it then gets created again below. I'm doing this only so that the test passes; we will want to change this.

kaczmarj · 2023-08-15T15:29:05Z

i see. so what we typically do for model outputs is skip any slides that already have model output CSVs. we should implement the same behavior for the geojson conversion.

so in the list of csvs to be converted, we should remove any that already exist as geojson. so existing geojsons will not be touched.
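
a minimal sketch of that filtering step, assuming the geojson files share the CSV's stem (the slide ID) and end in `.json`. the function name `csvs_without_geojson` is hypothetical and not part of this PR.

```python
from __future__ import annotations

from pathlib import Path


def csvs_without_geojson(csvs: list[Path], geojson_dir: Path) -> list[Path]:
    """Return only the CSVs whose slide ID has no GeoJSON in geojson_dir yet."""
    if not geojson_dir.is_dir():
        # Nothing has been converted so far, so every CSV still needs converting.
        return list(csvs)
    # Slide IDs that already have a GeoJSON; these files are left untouched.
    existing_slide_ids = {path.stem for path in geojson_dir.glob("*.json")}
    return [csv for csv in csvs if csv.stem not in existing_slide_ids]
```

with this, a second run converts only the new slides, and the output directory never has to be deleted.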

swaradgat19 · 2023-08-15T15:35:27Z

Got it. I'll make the changes

wsinfer/write_geojson.py

swaradgat19 · 2023-08-16T20:46:09Z

@kaczmarj Not entirely sure why the pytorch-nightly test is failing. Might be an issue with slide_path perhaps?

kaczmarj · 2023-08-16T21:07:34Z

i think there are two issues.

the style tests are failing. to fix that, run isort and black on the code to format the code.
to fix the pytorch nightly test, i think we need to check that a certain variable is not None.

wsinfer/wsinfer/wsi.py

Line 225 in b05b7ee

page0 = series0[0]

add the following between lines 225 and 226

if page0 is None:
    raise CannotReadSpacing()
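
as a self-contained illustration of that guard (a sketch only): `CannotReadSpacing` below is a stand-in class with the same name as the one referenced above, and `first_page_or_raise` is a hypothetical helper, not a function in wsinfer/wsi.py. whether this check is enough to fix the nightly failure is not established here.

```python
class CannotReadSpacing(Exception):
    """Stand-in for the exception of the same name referenced above."""


def first_page_or_raise(series0):
    """Return the first page of a series, refusing a missing (None) page."""
    page0 = series0[0]
    if page0 is None:
        # Fail early with a specific error instead of an AttributeError later.
        raise CannotReadSpacing()
    return page0
```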

swaradgat19 · 2023-08-17T22:22:22Z

Tried a try-except too. Didn't work

kaczmarj · 2023-08-18T01:12:50Z

i will take a look at this. it could be that something in the tifffile has changed slightly

kaczmarj · 2023-09-15T18:48:10Z

i'll review this pr soon. in the meantime, can you please merge the main branch into your branch? i made a few fixes in #188. you will also have to resolve a merge conflict with wsinfer/wsi.py.

kaczmarj

i left a few change requests. thanks for working on this @swaradgat19

kaczmarj · 2023-09-15T19:21:09Z

wsinfer/write_geojson.py

 import uuid
+from functools import partial
 from pathlib import Path

 import click


click isn't used here, so we can remove it.

Suggested change

-import click

kaczmarj · 2023-09-15T19:22:03Z

wsinfer/write_geojson.py

-def convert(input: str | Path, output: str | Path) -> None:
-    df = pd.read_csv(input)
+def make_geojson(csv: Path, results_dir: Path) -> None:
+    filename = csv.stem


nitpick, but could you rename this to slide_id?

kaczmarj · 2023-09-15T19:26:07Z

wsinfer/cli/infer.py

@@ -374,3 +375,6 @@ def run(
        json.dump(run_metadata, f, indent=2)

    click.secho("Finished.", fg="green")
+
+    csvs = list((results_dir / "model-outputs-csv").glob("*.csv"))
+    write_geojsons(csvs, results_dir, num_workers)


the geojson writing should happen in the file run_inference.py -- see

wsinfer/wsinfer/modellib/run_inference.py

Line 194 in 123d7c1

slide_df.to_csv(slide_csv, index=False)

for where the CSVs are written. The GeoJSON conversion can happen right after that.
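
a minimal sketch of writing the GeoJSON right after the CSV for one slide. everything here is illustrative: the function names, the `.json` extension, and the column names (`minx`, `miny`, `width`, `height`, `prob_*`) are assumptions, and it uses only the standard library rather than the pandas code in run_inference.py.

```python
from __future__ import annotations

import csv
import json
from pathlib import Path


def tile_to_feature(row: dict) -> dict:
    """Turn one tile row into a GeoJSON Feature with a rectangular Polygon."""
    minx, miny = float(row["minx"]), float(row["miny"])
    maxx, maxy = minx + float(row["width"]), miny + float(row["height"])
    # A GeoJSON linear ring must end on the same position it starts on.
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    props = {key: value for key, value in row.items() if key.startswith("prob_")}
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": props,
    }


def write_csv_then_geojson(rows: list[dict], slide_id: str, results_dir: Path) -> Path:
    """Write the slide's CSV, then immediately write the matching GeoJSON."""
    csv_dir = results_dir / "model-outputs-csv"
    geojson_dir = results_dir / "model-outputs-geojson"
    csv_dir.mkdir(parents=True, exist_ok=True)
    geojson_dir.mkdir(parents=True, exist_ok=True)

    # Assumes rows is non-empty; the header is taken from the first row.
    with open(csv_dir / f"{slide_id}.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)

    slide_geojson = geojson_dir / f"{slide_id}.json"
    collection = {
        "type": "FeatureCollection",
        "features": [tile_to_feature(row) for row in rows],
    }
    with open(slide_geojson, "w") as f:
        json.dump(collection, f)
    return slide_geojson
```

doing the conversion per slide at this point means a slide's CSV and GeoJSON are always produced together, so no separate pass over the CSV directory is needed at the end of the run.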

kaczmarj · 2023-09-15T19:28:35Z

wsinfer/write_geojson.py

-
-    output.mkdir(exist_ok=False)
+        # Makes a list of filenames for both geojsons and csvs
+        geojson_filenames = [filename.stem for filename in geojsons]


a nit: instead of using filenames in the variable name, could you use slide_ids? it will make it more obvious that these are slide IDs.

swaradgat19 · 2023-09-15T23:42:01Z

> i left a few change requests. thanks for working on this @swaradgat19

Sure @kaczmarj! I was actually trying to merge main into my main branch, but ran into issues (GitHub isn't allowing me to sync my forked repo because I'm 1 commit behind and 13 commits ahead of SBU-BMI/wsinfer). I've created a new branch, fix/geojson_command, with #188 included. Should I open a new PR with that branch?

kaczmarj · 2023-09-16T22:14:22Z

that's fine, let's continue the discussion in #191. in the future, you can fix this sort of "merge conflict" on the command line. here are some docs that should help: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-using-the-command-line

closing because this is replaced by #191

swaradgat19 and others added 3 commits August 12, 2023 00:12

generated geojson outputs by default in wsinfer run command

314e1d2

Merge branch 'SBU-BMI:main' into main

0146268

generating geojson files on run command, parallelized geojson conversion using tqdm

b9b6244

changed tests according to updated command

afde53f

modified open() with separate filename variable

4a37e3b

swaradgat19 added 3 commits August 15, 2023 11:40

changed make_geojson parameter csv from a string to a Path variable

ebb00f7

convert only new csv files to geojson instead of all

ee116d0

handled condition if output directory doesn't exist

e606e5e

kaczmarj reviewed Aug 16, 2023

wsinfer/write_geojson.py (outdated)

kaczmarj requested changes Aug 16, 2023

changed function name to write_geojsons, fixed minor issues

636c475

swaradgat19 added 4 commits August 16, 2023 17:17

raise CannotReadSpacing if page0 is none, fixed style issue

61a00df

order issue fixed

0645502

added a try catch to handle pytorch-nightly error

800b8b2

styled using isort and black

9c33ebc

kaczmarj mentioned this pull request Sep 15, 2023

check that page0 is TiffPage inst #188

Merged

kaczmarj requested changes Sep 15, 2023

kaczmarj closed this Sep 16, 2023
