Work for merging Kaiping-2018-591 #467

Merged
merged 14 commits into from
Apr 23, 2018

Anaphory
Contributor

That source is available through http://www.model-ling.eu/lexirumah/, with individual concepts of the form http://www.model-ling.eu/lexirumah/parameters/angry where angry is our ID for that concept.

The concept list of the source in cldf is maintained as https://github.com/lessersunda/lexirumah-data/blob/with_lexi_data/cldf/concepts.csv, and should be easy to get into the format for https://github.com/clld/concepticon-data/tree/master/concepticondata/conceptlists once the concepts submitted here are linked in there.

@Anaphory
Contributor Author

Oops, sorry for messing up the concept lists table by accidentally removing the quotation marks from all descriptions that contain commas.

@Anaphory
Contributor Author

ERROR:conceptrelations:522: invalid TARGET_GLOSS: RACK (FOR STORING FOOD)

Would KITCHEN RACK be valid?

ERROR:conceptlists.tsv:221: invalid bibtex record: Kaiping2018

How would you like it?

@Anaphory
Contributor Author

Anaphory commented Apr 17, 2018

@LinguList, @chrzyki, and @tresoldi – anything else?

@tresoldi
Contributor

I think it is better for @LinguList to comment on new concepts (is this urgent? I mean, for publication?). They all seem reasonable to me and you did great work on the concept relations, but then again I'm the one who always wants to add more and more concepts...

Your conceptlists.tsv has problems with the escaping of quotes. It is the same problem @MacyL had when using LibreOffice Calc, and I am in favor of fixing it in master -- we don't really need those quoted texts, given that the delimiters are tabs and not commas. But for the time being you should keep it as it is.

You also need to add Kaiping2018 to the bibtex references (it is only in the concept lists) and take a look at the automatic test failure at line 522 in concept relations, with SHELF > RACK (FOR STORING FOOD). I can't see what the problem is here; maybe there is a typo I am missing?
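
A missing reference like this could in principle be caught locally before pushing. The following is only a rough sketch, not the project's actual test: the function name `missing_bibtex_keys` and the regular expression are assumptions made for illustration, and `Example2016` is a placeholder key, not a real entry.

```python
import re

# Assumption: entry keys look like "@article{Key," (simple BibTeX, no nesting tricks).
BIBTEX_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def missing_bibtex_keys(cited_keys, bibtex_text):
    """Return the cited keys that have no entry in the BibTeX text, sorted."""
    known = set(BIBTEX_KEY.findall(bibtex_text))
    return sorted(set(cited_keys) - known)


# Placeholder BibTeX content; Example2016 is a made-up key for the demo.
bib = "@article{Example2016,\n  title = {Example},\n  year = {2016}\n}\n"
print(missing_bibtex_keys(["Example2016", "Kaiping2018"], bib))
```

Run against the real references file, a non-empty result would point at keys that are used in the concept lists table but still need a BibTeX entry.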

@Anaphory
Contributor Author

Anaphory commented Apr 17, 2018

Nope, it's not particularly urgent, he just said I should assign all three of you as reviewers, and given that I'm not an admin of this repo and cannot do that, I thought tagging you would be the next best thing.

[Problems]

I noticed those, see comments above. (Yes, I was using localc, too, and I did not check my commit diffs well enough.)

@Anaphory
Contributor Author

Oh, how much do you like getting concept_set_metadata? I could possibly set someone here to build some for you at least for our new stuff, if you have a strong interest in expanding them and no-one to do it.

@tresoldi
Contributor

Personally, I think I'd like it, and suppose @LinguList would too. Just pinging him here, he'll probably be able to answer next week.

@tresoldi
Contributor

I forgot to mention: we were actually discussing this last Friday, when I started looking at the automatic WordNet mapping already there (our specific problem is the various Chinese verbs that are all glossed as "to cut"). One of the ideas was that we might focus first on mapping to BabelNet, in order to 1. reuse other mappings the project already does (WordNet itself, Wiktionary, etc.) and 2. refer to a more open catalogue (more likely to accept our changes).

@Anaphory
Contributor Author

Excellent. Once we have sorted out this collection of concepts I'm submitting here (and whether some of them need changing), and once you have a good picture of what kinds of connections have priority for you, I'll try to set one of our student assistants to that task if possible.

@tresoldi
Contributor

Really great!

And hopefully s/he will be able to fix errors such as "my/mine" mapped to "a small train station in Belgium". ;)

@Anaphory
Contributor Author

I'd set them up to deal with our particular concepts first, and with checking the other 580-or-so concepts we link to afterwards, so I guess potential Belgian train stations would come very late – possession works differently in Eastern Indonesian languages ;-P

@tresoldi
Contributor

There is now a conflict, but it is just a false positive (two new bibliographies being added at the same time). We can merge later when @LinguList has reviewed. :)

@Anaphory
Contributor Author

I needed to merge it to have the Travis tests run on this.

@Anaphory
Contributor Author

Oof, I found the culprit for the other error. RACK (FOR STORING FOOD) had a space after it in concepticon.tsv, but not in conceptrelations.tsv. Where would you suggest I should add a check that gloss.strip() == gloss?

@codecov-io

codecov-io commented Apr 19, 2018

Codecov Report

Merging #467 into master will not change coverage.
The diff coverage is n/a.

@@          Coverage Diff           @@
##           master    #467   +/-   ##
======================================
  Coverage    88.1%   88.1%           
======================================
  Files           6       6           
  Lines         950     950           
======================================
  Hits          837     837           
  Misses        113     113

Last update 8873199...43724ce.

@tresoldi
Contributor

tresoldi commented Apr 19, 2018

No idea, @xrotwang or @LinguList would be the ones to decide here. I actually think that it is good that the test fails (the glosses are different, after all); maybe we'd better invest in a "linter" or in more informative reporting (even printing the culprit inside a list, [gloss], would help).

In any case, my opinion is that the PR is good for merging, let's wait for Mattis' opinion on the new concepts. Good work there, I know how hard it can be to fix every missing detail! :)

@Anaphory
Contributor Author

It's perfectly fine that the test fails, and actually the test was informative enough if you actually read what it says instead of assuming things. I'm suggesting that some new test should fail if a gloss starts or ends with whitespace, because those are errors that are seriously hard to track down, in particular in TSV.
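
Such a check could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `find_padded_glosses` and the `GLOSS` column name are hypothetical and not taken from the existing test suite, and the sample rows are inlined rather than read from concepticon.tsv. Reporting `repr(value)` makes the stray space visible, in the spirit of printing the culprit inside a list.

```python
import csv
import io


def find_padded_glosses(tsv_text, column="GLOSS"):
    """Return (line_number, repr(value)) for values with leading or trailing whitespace."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    problems = []
    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        value = row.get(column) or ""
        if value != value.strip():
            problems.append((line_number, repr(value)))
    return problems


# Inlined sample; note the trailing space after the first gloss.
sample = "ID\tGLOSS\n3176\tRACK (FOR STORING FOOD) \n3177\tRAISED PLATFORM\n"
print(find_padded_glosses(sample))
```

A test would then simply assert that the returned list is empty for every gloss column it is pointed at.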

@tresoldi
Contributor

tresoldi commented Apr 19, 2018 via email

3166 TWENTY TWO Quantity The cardinal number represented in Roman numerals as XXII, and in Arabic numerals as 22. Other
3167 SERMON Religion and belief A religious lecture or speech, addressing issues of religion or morality. Person/Thing
3168 DOWRY Social and political relations Property, gifts or money transfered from the parents of a daughter to her new family at marriage. Person/Thing
3169 TREATY Social and political relations Formal agreement between nations, states or other international bodies. Person/Thing
Contributor

Please double check, we may have something similar, like "contract" already there. I am sure I read this or similar already in the whole concepticon.

Contributor Author

I was looking for ‘contract’ and not finding one, I'll have another look.

Contributor Author

I've checked various synonyms again, the only thing remotely in that direction I could find is PLOT with a strange definition.

Contributor

okay, thanks, I also checked with http://calc.digling.org/concepticon/ and you're right, so this is a good new concept.

3167 SERMON Religion and belief A religious lecture or speech, addressing issues of religion or morality. Person/Thing
3168 DOWRY Social and political relations Property, gifts or money transfered from the parents of a daughter to her new family at marriage. Person/Thing
3169 TREATY Social and political relations Formal agreement between nations, states or other international bodies. Person/Thing
3170 BATHE SOMEONE The body Clean someone else with water. Action/Process
Contributor

BATHE (SOMEONE) is a better way to gloss this.

Contributor Author

I thought I had that. Will fix.

Contributor

As a general rule: separate main verb and put objects and things in brackets. It's easier for later automatic matching.

3171 RINGWORM The body Dermatophytosis, fungal skin infection making round patches on the skin. Person/Thing
3172 SCABIES The body A contagious skin rash caused by the mite Sarcoptes scabiei. Person/Thing
3173 TINEA The body Dermatophytosis, fungal skin infection making white spots/non-symmetrical white patches on the skin. Person/Thing
3174 WAKE SOMEONE UP The body Rouse from sleep. Action/Process
Contributor

We have WAKE UP already, so make sure to check that this is different, and also look at the reflexes. If we use it in intransitive form, add WAKE UP (SOMEONE OR ONESELF) as the over-arching concept, and WAKE UP (SOMEONE) vs. WAKE UP (ONESELF) [or similar, better gloss ideas welcome] plus a relation (compare the triple BURN, BURNING, BURN (SOMETHING)).

Contributor Author

@Anaphory Anaphory Apr 20, 2018

I'm not able to check the counterparts semantically, but we have
kasih bangun (seseorang),Like 'membangunkan' (but not with the sense of 'to erect a building') for the elicitation – the seseorang is the object somebody, and mem-…-kan (or something like that, my Indonesian is still negligible) is a transitivizing circumfix.

3172 SCABIES The body A contagious skin rash caused by the mite Sarcoptes scabiei. Person/Thing
3173 TINEA The body Dermatophytosis, fungal skin infection making white spots/non-symmetrical white patches on the skin. Person/Thing
3174 WAKE SOMEONE UP The body Rouse from sleep. Action/Process
3175 MOUND OF STONES FOR RITUAL ACTIVITIES The house A mound of stones or altar around/on which ritual activities are held. Person/Thing
Contributor

Is this concept THAT important that you want to add it? Maybe just leave it unlinked for the time being?

Contributor Author

I don't see how it's more strange than any of the other concepts we have, but sure, I can leave it unlinked. It's attested in about a quarter of our lects, and often with apparently mono- or bi-morphemic words (http://www.model-ling.eu/lexirumah/parameters/ritual_mound#8/-9.215/124.899), but we also have it in our cultural questionnaires with probably more explanation.

Contributor

Similar to a case we had above: if you have a cultural term (compare it with "mid-autumn festival") it is fine for me, but as a concrete cultural thing in one region, the gloss sounds too abstract.

Contributor Author

Ah, I get it. I'll have a look for a better gloss and maybe definition.

3174 WAKE SOMEONE UP The body Rouse from sleep. Action/Process
3175 MOUND OF STONES FOR RITUAL ACTIVITIES The house A mound of stones or altar around/on which ritual activities are held. Person/Thing
3176 RACK (FOR STORING FOOD) The house Bamboo shelves in the kitchen to store food and utensils. Person/Thing
3177 RAISED PLATFORM The house An elevated wooden or bamboo platform on which people can sit, squat or sleep. Person/Thing
Contributor

is this the best way to gloss this? Seems to be something specific, so maybe use a term they use in the culture? Or decide to leave unlinked.

Contributor Author

Yes, I can gloss this as BALE-BALE.

Contributor

then I'd suggest: BALE-BALE (RAISED PLATFORM IN HOUSE)

3178 RITUAL GROUND The house An uncovered open area used for ritual activities and public meetings. Person/Thing
3179 CORAL ROCK The physical world Limestone created from the skeleton of reef-forming stony corals (Scleractinia), displaying their growth patterns. Person/Thing
3180 SHOOT WITH SLINGSHOT Warfare and hunting Use a rubber-based projectile weapon to propel projectiles against a target. Action/Process
3181 PLANT RICE Agriculture and vegetation To put a rice seedling in the wet ground of a paddy so that it strikes root and grows. Action/Process
Contributor

We have several forms of PLANT. Is it REALLY "PLANT (RICE)", or just plant something, with rice as a dummy term? Be careful with this and double-check: most Sino-Tibetan datasets also have plant (rice), but there, rice is a dummy term that could be replaced by anything and is only included to make the word easier to elicit (like "eat" vs. "eat rice" == eat). And write PLANT (RICE).

3179 CORAL ROCK The physical world Limestone created from the skeleton of reef-forming stony corals (Scleractinia), displaying their growth patterns. Person/Thing
3180 SHOOT WITH SLINGSHOT Warfare and hunting Use a rubber-based projectile weapon to propel projectiles against a target. Action/Process
3181 PLANT RICE Agriculture and vegetation To put a rice seedling in the wet ground of a paddy so that it strikes root and grows. Action/Process
3182 PLANT YAM Agriculture and vegetation To put a yam seed tuber into ridged-up ground so that it strikes root and grows. Action/Process
Contributor

This seems to be a bit too specific, maybe leave it unlinked (like "plant rice"). At the least, write "PLANT (YAM)".

Contributor Author

As far as yam and rice go: Yes, ‘rice’ and ‘yam’ are in brackets in our Indonesian elicitation, but looking at the counterparts, there is quite a bit of variety between the two actions. A good chunk of languages co-lexify them, but there are also several with short, distinct words for the processes of (I assume) sticking a rice seedling into a flooded field, vs. dibbling a hole into ridged-up soil and putting seed yams in there.

So, from my abstract perspective, this is just a disambiguation like every other one, although I should probably check which parts of that process are exactly implied by our elicitation process and leave them unlinked until then.

Contributor

Good point. Can you find a way to express this in the concepticon gloss, or by making the definition broader? If we add these, we'd have: plant (something), plant (rice), plant (yam)?

3180 SHOOT WITH SLINGSHOT Warfare and hunting Use a rubber-based projectile weapon to propel projectiles against a target. Action/Process
3181 PLANT RICE Agriculture and vegetation To put a rice seedling in the wet ground of a paddy so that it strikes root and grows. Action/Process
3182 PLANT YAM Agriculture and vegetation To put a yam seed tuber into ridged-up ground so that it strikes root and grows. Action/Process
3183 SARONG OF WOMAN Clothing and grooming The female style of a large tube or length of fabric, often wrapped around the waist, worn throughout much of South Asia. Person/Thing
Contributor

We had some women's clothing term that sounded similar, so maybe check our reflexes. If there is none, write SARONG (OF WOMAN).

Contributor Author

How did I mess up the brackets nearly every single time?

Obviously, you do have SARONG, but as far as I'm told, there are languages that have different terms for SARONG (OF MAN) and SARONG (OF WOMAN). Given that we have only one of these, I'd be fine with linking this to SARONG and making the disambiguation part of the notes.

Contributor

Okay, I think, as it would otherwise require setting up SARONG -> SARONG (OF WOMAN), etc., which is then also a possibility, even if SARONG (OF MAN) is not (yet) there.

3181 PLANT RICE Agriculture and vegetation To put a rice seedling in the wet ground of a paddy so that it strikes root and grows. Action/Process
3182 PLANT YAM Agriculture and vegetation To put a yam seed tuber into ridged-up ground so that it strikes root and grows. Action/Process
3183 SARONG OF WOMAN Clothing and grooming The female style of a large tube or length of fabric, often wrapped around the waist, worn throughout much of South Asia. Person/Thing
3184 TRADITIONAL HOUSE The house A dwelling built in a vernacular or ethnic architectural style. Person/Thing
Contributor

not clear what that means, too fuzzy. probably a special term in your list, so please give a more concrete gloss and explanation.

Contributor Author

It's actually a fuzzy term, I think, because the ethnic architectures may differ and I think one of the ideas was that this item might be useful in cross-linking lexicon and culture. I guess that may be a good reason to leave it unlinked.

Contributor

yes, let's leave it unlinked for the time being.

@@ -509,3 +509,36 @@ SOURCE SOURCE_GLOSS RELATION TARGET TARGET_GLOSS
2693 MATERNAL AUNT (MOTHER'S SISTER) narrower 3042 MATERNAL AUNT (MOTHER'S OLDER SISTER)
2693 MATERNAL AUNT (MOTHER'S SISTER) narrower 3043 MATERNAL AUNT (MOTHER'S YOUNGER SISTER)
932 BARLEY narrower 3079 HIGHLAND BARLEY
51 SORE narrower 3171 RINGWORM
Contributor

be careful, we have exemplars of a category, so don't link them in this way...

2982 GIFT narrower 3168 DOWRY
3024 BRIDE PRICE similar 3168 DOWRY
3110 CLAP similar 3154 SLAP
3166 TWENTY TWO similar 3166 TWENTY TWO
Contributor

Guess we don't need this line, right?

Contributor Author

One of them was supposed to be TWENTY THREE, is that useful? Probably not, now that I understand what the relations are for. I'll remove it.
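
A relation from a concept to itself, like the TWENTY TWO row, is the kind of slip a small automated check could flag. The following is a minimal sketch and not part of the existing tests: the function name `find_self_relations` is hypothetical, the column names are taken from the header shown in the diff above, and the sample rows are inlined for illustration.

```python
import csv
import io


def find_self_relations(tsv_text):
    """Return (line_number, id, gloss) for rows whose SOURCE equals their TARGET."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [
        (line_number, row["SOURCE"], row["SOURCE_GLOSS"])
        # The header is line 1, so the first data row is line 2.
        for line_number, row in enumerate(reader, start=2)
        if row["SOURCE"] == row["TARGET"]
    ]


sample = (
    "SOURCE\tSOURCE_GLOSS\tRELATION\tTARGET\tTARGET_GLOSS\n"
    "3110\tCLAP\tsimilar\t3154\tSLAP\n"
    "3166\tTWENTY TWO\tsimilar\t3166\tTWENTY TWO\n"
)
print(find_self_relations(sample))
```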

2584 BENCH similar 3177 RAISED PLATFORM
2982 GIFT narrower 3168 DOWRY
3024 BRIDE PRICE similar 3168 DOWRY
3110 CLAP similar 3154 SLAP
Contributor

x is similar x?

Contributor Author

No, SLAP is a specification of HIT, similar to CLAP, which makes a similar sound but involves hitting your own hand rather than someone else. We actually checked our elicitation procedure and your definitions for that.

Contributor

sorry, I actually read "slap" vs. "slap" both times ...

volume = {5},
number = {3},
year = {2018},
volume = 5,
Contributor

please leave brackets, if possible...

Contributor Author

Oops. I don't remember touching those lines, and I edited them with a text editor so I should know, but it looks like I still messed up.

Contributor Author

And I did it in a merge commit, at that. That's like doubly evil. I'll see whether I can undo it, otherwise I'll just fix it.

Contributor

@LinguList LinguList left a comment

quite a few things to discuss. I'll also have to check the linkings. Generally good work, but not ready to merge yet.

@tresoldi
Contributor

tresoldi commented Apr 19, 2018 via email

@LinguList
Contributor

Yes, @tresoldi, a general problem, especially the more concepts we get. It can't be only a few people who remember what concepts have been linked or not linked. No idea how to tackle this now, however...

@tresoldi
Contributor

It is one more thing for the mapping methods which are not fully developed and, in particular, not used to their full extent. Let's get back to this in time, I'll open an issue on that for the time being (one more thing to tackle when we decide to go over all the issues).

@Anaphory
Contributor Author

Anaphory commented Apr 20, 2018

The things still under discussion were:

3175 MOUND OF STONES FOR RITUAL ACTIVITIES The house A mound of stones or altar around/on which ritual activities are held. Person/Thing
3181 PLANT RICE Agriculture and vegetation To put a rice seedling in the wet ground of a paddy so that it strikes root and grows. Action/Process
3182 PLANT YAM Agriculture and vegetation To put a yam seed tuber into ridged-up ground so that it strikes root and grows. Action/Process

Everything else should be fixed or will be in a few commits.

@Anaphory
Contributor Author

For the further procedure: It may take a month or two to check those problematic things in detail, because most of our group is preparing to be in Indonesia at least for most of May.

Would you prefer to have the problematic concepts taken out for now and be left unlinked, and to close this PR with that caveat; or would you prefer to leave this PR open while those issues are unresolved, and only merge all LexiRumah things in one go later?

@LinguList
Contributor

I'd say: why not put them aside for now and update the list later? You'll probably also have different releases of lexirumah, so we'd already have most of it covered, and then you could update things later. So we keep your great work with the positive feeling of having added another concept list, etc.

@Anaphory
Contributor Author

Done, and https://github.com/lessersunda/lexirumah-data/blob/with_lexi_data/cldf/concepts.csv should be in a better (I probably missed a thing or two, but it should be decent overall) state now, too.

Contributor

@LinguList LinguList left a comment

okay, anybody having the rights, feel free to merge if you agree, I am fine with this. Thanks @Anaphory

@tresoldi tresoldi merged commit a51f8de into concepticon:master Apr 23, 2018
4 participants