Cleanup of regression script (#1179)
lintool committed May 11, 2020
1 parent 693eb5d commit cee4463
Showing 31 changed files with 266 additions and 270 deletions.
10 changes: 5 additions & 5 deletions docs/regressions-backgroundlinking18.md
@@ -10,8 +10,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection WashingtonPostCollection -input /path/to/backgroundlinking18 \
- -index lucene-index.backgroundlinking18.pos+docvectors+rawdocs -generator WashingtonPostGenerator -threads 1 \
- -storePositions -storeDocvectors -storeRaw >& log.backgroundlinking18.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.core18.pos+docvectors+raw -generator WashingtonPostGenerator -threads 1 \
+ -storePositions -storeDocvectors -storeRaw >& logs/log.backgroundlinking18.pos+docvectors+rawdocs &
  ```

  The directory `/path/to/core18/` should be the root directory of the [TREC Washington Post Corpus](https://trec.nist.gov/data/wapost/), i.e., `ls /path/to/core18/`
@@ -29,15 +29,15 @@ Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/m
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.backgroundlinking18.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core18.pos+docvectors+raw \
  -topicreader BackgroundLinking -topics src/main/resources/topics-and-qrels/topics.backgroundlinking18.txt \
  -backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 -output run.backgroundlinking18.bm25.topics.backgroundlinking18.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.backgroundlinking18.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core18.pos+docvectors+raw \
  -topicreader BackgroundLinking -topics src/main/resources/topics-and-qrels/topics.backgroundlinking18.txt \
  -backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 -output run.backgroundlinking18.bm25+rm3.topics.backgroundlinking18.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.backgroundlinking18.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core18.pos+docvectors+raw \
  -topicreader BackgroundLinking -topics src/main/resources/topics-and-qrels/topics.backgroundlinking18.txt \
  -backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 -output run.backgroundlinking18.bm25+rm3+df.topics.backgroundlinking18.txt &
  ```
10 changes: 5 additions & 5 deletions docs/regressions-backgroundlinking19.md
@@ -10,8 +10,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection WashingtonPostCollection -input /path/to/backgroundlinking19 \
- -index lucene-index.backgroundlinking19.pos+docvectors+rawdocs -generator WashingtonPostGenerator -threads 1 \
- -storePositions -storeDocvectors -storeRaw >& log.backgroundlinking19.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.core18.pos+docvectors+raw -generator WashingtonPostGenerator -threads 1 \
+ -storePositions -storeDocvectors -storeRaw >& logs/log.backgroundlinking19.pos+docvectors+rawdocs &
  ```

  The directory `/path/to/core18/` should be the root directory of the [TREC Washington Post Corpus](https://trec.nist.gov/data/wapost/), i.e., `ls /path/to/core18/`
@@ -29,15 +29,15 @@ Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/m
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.backgroundlinking19.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core18.pos+docvectors+raw \
  -topicreader BackgroundLinking -topics src/main/resources/topics-and-qrels/topics.backgroundlinking19.txt \
  -backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 -output run.backgroundlinking19.bm25.topics.backgroundlinking19.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.backgroundlinking19.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core18.pos+docvectors+raw \
  -topicreader BackgroundLinking -topics src/main/resources/topics-and-qrels/topics.backgroundlinking19.txt \
  -backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 -output run.backgroundlinking19.bm25+rm3.topics.backgroundlinking19.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.backgroundlinking19.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core18.pos+docvectors+raw \
  -topicreader BackgroundLinking -topics src/main/resources/topics-and-qrels/topics.backgroundlinking19.txt \
  -backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 -output run.backgroundlinking19.bm25+rm3+df.topics.backgroundlinking19.txt &
  ```
16 changes: 8 additions & 8 deletions docs/regressions-car17v1.5.md
@@ -10,8 +10,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection CarCollection -input /path/to/car17v1.5 \
- -index lucene-index.car17v1.5.pos+docvectors+rawdocs -generator DefaultLuceneDocumentGenerator -threads 1 \
- -storePositions -storeDocvectors -storeRaw >& log.car17v1.5.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.car17v1.5.pos+docvectors+raw -generator DefaultLuceneDocumentGenerator -threads 1 \
+ -storePositions -storeDocvectors -storeRaw >& logs/log.car17v1.5.pos+docvectors+rawdocs &
  ```

  The directory `/path/to/car17v1.5` should be the root directory of Complex Answer Retrieval (CAR) paragraph corpus (v1.5), which can be downloaded [here](http://trec-car.cs.unh.edu/datareleases/).
@@ -30,27 +30,27 @@ Specifically, this is the section-level passage retrieval task with automatic gr
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v1.5.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
  -bm25 -output run.car17v1.5.bm25.topics.car17v1.5.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v1.5.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
  -bm25 -rm3 -output run.car17v1.5.bm25+rm3.topics.car17v1.5.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v1.5.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
  -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -output run.car17v1.5.bm25+ax.topics.car17v1.5.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v1.5.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
  -qld -output run.car17v1.5.ql.topics.car17v1.5.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v1.5.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
  -qld -rm3 -output run.car17v1.5.ql+rm3.topics.car17v1.5.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v1.5.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v1.5.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt \
  -qld -axiom -axiom.deterministic -rerankCutoff 20 -output run.car17v1.5.ql+ax.topics.car17v1.5.benchmarkY1test.txt &
  ```
16 changes: 8 additions & 8 deletions docs/regressions-car17v2.0-doc2query.md
@@ -16,8 +16,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection JsonCollection -input /path/to/car17v2.0-doc2query \
- -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs -generator DefaultLuceneDocumentGenerator -threads 30 \
- -storePositions -storeDocvectors -storeRaw >& log.car17v2.0-doc2query.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw -generator DefaultLuceneDocumentGenerator -threads 30 \
+ -storePositions -storeDocvectors -storeRaw >& logs/log.car17v2.0-doc2query.pos+docvectors+rawdocs &
  ```

  The directory `/path/to/car17v2.0-doc2query` should be the root directory of Complex Answer Retrieval (CAR) paragraph corpus (v2.0) that has been augmented with the doc2query expansions, i.e., `collection_jsonl_expanded_topk10/` as described in [this page](experiments-doc2query.md).
@@ -36,27 +36,27 @@ Specifically, this is the section-level passage retrieval task with automatic gr
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -bm25 -output run.car17v2.0-doc2query.bm25.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -bm25 -rm3 -output run.car17v2.0-doc2query.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -output run.car17v2.0-doc2query.bm25+ax.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -qld -output run.car17v2.0-doc2query.ql.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -qld -rm3 -output run.car17v2.0-doc2query.ql+rm3.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0-doc2query.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0-doc2query.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -qld -axiom -axiom.deterministic -rerankCutoff 20 -output run.car17v2.0-doc2query.ql+ax.topics.car17v2.0.benchmarkY1test.txt &
  ```
16 changes: 8 additions & 8 deletions docs/regressions-car17v2.0.md
@@ -10,8 +10,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection CarCollection -input /path/to/car17v2.0 \
- -index lucene-index.car17v2.0.pos+docvectors+rawdocs -generator DefaultLuceneDocumentGenerator -threads 1 \
- -storePositions -storeDocvectors -storeRaw >& log.car17v2.0.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.car17v2.0.pos+docvectors+raw -generator DefaultLuceneDocumentGenerator -threads 1 \
+ -storePositions -storeDocvectors -storeRaw >& logs/log.car17v2.0.pos+docvectors+rawdocs &
  ```

  The directory `/path/to/car17v2.0` should be the root directory of Complex Answer Retrieval (CAR) paragraph corpus (v2.0), which can be downloaded [here](http://trec-car.cs.unh.edu/datareleases/).
@@ -30,27 +30,27 @@ Specifically, this is the section-level passage retrieval task with automatic gr
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -bm25 -output run.car17v2.0.bm25.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -bm25 -rm3 -output run.car17v2.0.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -output run.car17v2.0.bm25+ax.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -qld -output run.car17v2.0.ql.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -qld -rm3 -output run.car17v2.0.ql+rm3.topics.car17v2.0.benchmarkY1test.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.car17v2.0.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.car17v2.0.pos+docvectors+raw \
  -topicreader Car -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt \
  -qld -axiom -axiom.deterministic -rerankCutoff 20 -output run.car17v2.0.ql+ax.topics.car17v2.0.benchmarkY1test.txt &
  ```
6 changes: 3 additions & 3 deletions docs/regressions-clef06-fr.md
@@ -12,8 +12,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection JsonCollection -input /path/to/clef06-fr \
- -index lucene-index.clef06-fr.pos+docvectors+rawdocs -generator DefaultLuceneDocumentGenerator -threads 16 \
- -storePositions -storeDocvectors -storeRaw -language fr >& log.clef06-fr.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.clef06-fr.pos+docvectors+raw -generator DefaultLuceneDocumentGenerator -threads 16 \
+ -storePositions -storeDocvectors -storeRaw -language fr >& logs/log.clef06-fr.pos+docvectors+rawdocs &
  ```

  The collection comprises news articles from ATS (SDA) and Le Monde totaling 177,452 documents.
@@ -32,7 +32,7 @@ Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/m
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.clef06-fr.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.clef06-fr.pos+docvectors+raw \
  -topicreader TsvString -topics src/main/resources/topics-and-qrels/topics.clef06fr.mono.fr.txt \
  -language fr -bm25 -output run.clef06-fr.bm25.topics.clef06fr.mono.fr.txt &
  ```
16 changes: 8 additions & 8 deletions docs/regressions-core17.md
@@ -10,8 +10,8 @@ Typical indexing command:

  ```
  nohup sh target/appassembler/bin/IndexCollection -collection NewYorkTimesCollection -input /path/to/core17 \
- -index lucene-index.core17.pos+docvectors+rawdocs -generator DefaultLuceneDocumentGenerator -threads 16 \
- -storePositions -storeDocvectors -storeRaw >& log.core17.pos+docvectors+rawdocs &
+ -index indexes/lucene-index.core17.pos+docvectors+raw -generator DefaultLuceneDocumentGenerator -threads 16 \
+ -storePositions -storeDocvectors -storeRaw >& logs/log.core17.pos+docvectors+rawdocs &
  ```

  The directory `/path/to/nyt_corpus/` should be the root directory of the [New York Times Annotated Corpus](https://catalog.ldc.upenn.edu/LDC2008T19), i.e., `ls /path/to/nyt_corpus/`
@@ -29,27 +29,27 @@ Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/m
  After indexing has completed, you should be able to perform retrieval as follows:

  ```
- nohup target/appassembler/bin/SearchCollection -index lucene-index.core17.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core17.pos+docvectors+raw \
  -topicreader Trec -topics src/main/resources/topics-and-qrels/topics.core17.txt \
  -bm25 -output run.core17.bm25.topics.core17.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.core17.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core17.pos+docvectors+raw \
  -topicreader Trec -topics src/main/resources/topics-and-qrels/topics.core17.txt \
  -bm25 -rm3 -output run.core17.bm25+rm3.topics.core17.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.core17.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core17.pos+docvectors+raw \
  -topicreader Trec -topics src/main/resources/topics-and-qrels/topics.core17.txt \
  -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -output run.core17.bm25+ax.topics.core17.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.core17.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core17.pos+docvectors+raw \
  -topicreader Trec -topics src/main/resources/topics-and-qrels/topics.core17.txt \
  -qld -output run.core17.ql.topics.core17.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.core17.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core17.pos+docvectors+raw \
  -topicreader Trec -topics src/main/resources/topics-and-qrels/topics.core17.txt \
  -qld -rm3 -output run.core17.ql+rm3.topics.core17.txt &
- nohup target/appassembler/bin/SearchCollection -index lucene-index.core17.pos+docvectors+rawdocs \
+ nohup target/appassembler/bin/SearchCollection -index indexes/lucene-index.core17.pos+docvectors+raw \
  -topicreader Trec -topics src/main/resources/topics-and-qrels/topics.core17.txt \
  -qld -axiom -axiom.deterministic -rerankCutoff 20 -output run.core17.ql+ax.topics.core17.txt &
  ```
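
Note for anyone applying this change to an existing checkout: the updated commands write indexes under `indexes/` and redirect logs into `logs/`, and a shell redirect fails if the target directory does not exist. The following is a minimal sketch, not part of the commit, that creates both directories and moves any indexes still using the old `...+rawdocs` name to the new `indexes/...+raw` name. It assumes it is run from the repository root; the function name `migrate_indexes` is hypothetical. It does not touch log files (their names keep the `rawdocs` suffix in this diff), and it does not cover the backgroundlinking indexes, whose commands now point at the `core18` index name.

```shell
#!/bin/sh
# Hypothetical helper, not part of the commit: prepare the directories the
# updated commands expect and rename any indexes left over from the old layout.
set -eu

migrate_indexes() {
  # Output directories used by the updated indexing and logging commands.
  mkdir -p indexes logs

  # Old name: lucene-index.<collection>.pos+docvectors+rawdocs (repository root)
  # New name: indexes/lucene-index.<collection>.pos+docvectors+raw
  for old in lucene-index.*.pos+docvectors+rawdocs; do
    # Skip when the glob matched nothing or the match is not a directory.
    [ -d "$old" ] || continue
    new="indexes/${old%rawdocs}raw"
    if [ -e "$new" ]; then
      echo "skipping $old: $new already exists" >&2
      continue
    fi
    mv "$old" "$new"
  done
}

migrate_indexes
```

Running it a second time is harmless: `mkdir -p` is a no-op for existing directories, and already-moved indexes no longer match the old pattern.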