Add Bavarian ("Bairisch") Language #3033

Merged
merged 19 commits into from
May 11, 2023

79Luca79
Contributor

@79Luca79 79Luca79 commented May 4, 2023

ANTHONY ROWLEY (Professor of German Studies at the University of Munich): "This charter was actually intended for minority languages: this applies, for example, to the Danes in Schleswig-Holstein or the Sorbs in Saxony and Anhalt, and not to dialects per se. The thing is, however, that at a conference of the northern German Länder, Low German was protected as a regional language. The Association for the Promotion of Low German here in Bavaria then considered whether this should not also be claimed for Bavarian as a cultural language, if this is what is done with Low German. I actually have the following opinion on this: at first, I didn't think it was right to include Low German in this Charter, because it is just as much a dialect as Bavarian. But if Low German is already included, then I am very much of the opinion that Bavarian is also a regional language worthy of protection. However, it is precisely not variants of the standard language that should be protected. I have colleagues who say emphatically that Bavarian is German: for German is, if you like, an empty barrel into which many dialects belong, such as Bavarian, Franconian, Swabian and Low German, and even standardised New High German is a dialect of German, i.e. one dialect among others. This standard language has a certain social status, but it is nothing better than the others."

Full text of the interview:
https://web.archive.org/web/20080112084848/http://www.br-online.de/alpha/forum/vor9911/19991105_i.shtml

Why add it:

  1. According to ISO 639-3, it is recognized as its own language, with the code "bar".
  2. The linguistic distance between Bavarian and Standard German is greater than the distance between Swedish, Norwegian and Danish, between Serbian and Croatian, or between Czech and Slovak. (Standard German is supported in OA as "German", but Bavarian is not a dialect of Standard German!) So it cannot be a simple dialect: the linguistic distance is greater, and Bavarian is older than Standard German.
  3. In South Tyrol, Austria and Switzerland, dialects of Bavarian and Alemannic have been spoken since the Migration Period, and South Tyrolean German, Austrian German and Swiss German are officially used in school and in every written form. These are variations of German that are EQUAL to the Standard German of Germany and have their own rules and words. I list this here so people know that the "German" written and spoken in Germany is not the only correct way to speak and write a variation of Standard German. But I do not care about these 4 standards, because I consider them to be foreign languages to me.
  4. There are already many websites, books and texts that investigate and document Bavarian and its dialects, namely:
     1. Bavarian is recognized by Glottolog and Ethnologue, and it is listed as an endangered language. (That's why I'm here :) )
     2. IT HAS RULES that are already described in great detail in over 100 books; 60 of them are listed on Glottolog here: https://glottolog.org/resource/languoid/id/bava1246

The more interdisciplinary and everyday stuff:

  1. In South Tyrol, when we write an SMS or WhatsApp message or a simple note, or speak on the phone, we never talk or write in "German"; we exclusively use a dialect of Bavarian or, to make it simple, Bavarian. I think this applies to the rest of Tyrol, Austria and Bavaria as well.
  2. This brings me to my next issue: today, "German" in everyday speech means "Standard German" to the average person. People say that the German tourists speak German, and by that they mean Standard German. So if German today means Standard German, and Bavarian is older than Standard German, then Bavarian cannot be shoved under the German tab or be a dialect of German (= Standard German). The German tourists are called Germans, the Swiss are called Swiss and the Austrians are called Austrians; so no feeling of being German exists outside Germany anymore, as supported by Ernst Bruckmüller (Austrian self-identification) and by the "Sprachbarometer 2014" (https://www.provinz.bz.it/familie-soziales-gemeinschaft/integration/images/899470_sprachbarometer_2014.pdf). So where there are no more Germans, the language cannot be German. Identity = language. Every European country speaks a language named after the country, but the "German lands" (obsolete since 1530, 1870, 1919 or 1945, depending on the definition) all speak German? This is not how it works. (In the official ASTAT study "Sprachbarometer 2014", South Tyroleans were asked how they feel: 80% of "natives" (non-Ladins and non-Italians) feel South Tyrolean, 4% Austrian and 4% German.)
  3. The vast majority of Germans who speak Standard German do not understand my written language, speech or words. Only the Bavarians, Swiss and Austrians do, but they speak Bairisch and Swiss.
  4. Putting South Tyrolean or Bavarian under German instead of treating it as its own language is a big identity issue for the South Tyrolean natives, Austrians and "Swiss-speaking Swiss". South Tyroleans today live as a minority in a Romance-speaking state, Italy, and do not share the same culture and mentality as the Germans. (South Tyroleans are much closer to the Bairisch-speaking Bavarians and Austrians, the Alemannic-speaking Swiss, and Rhaeto-Romance speakers like the Ladins and their equivalents in Switzerland: many words, place names and speech patterns from Latin times still exist in the Zillertal and Vinschgau, and in Austria there are even Slavic family names, place names and everyday words, with Ladin equivalents in South Tyrol, eastern Switzerland and Tyrol!)
  5. (I don't mean this too seriously; this happened in the Hitler days and before, much less today, but it is still a small issue.) Calling Bavarian a dialect of Standard German is an (unconscious) imperialistic instrument for the Germans to claim ownership of lands and authority over the language. (Not my words, but Dr. Stefan Dollinger's.)
  6. South Tyrolean, all the other Bavarian dialects and Bavarian itself were never called "südtirolerisch Deutsch" or "bairischdeutsch". Look at Dutch: it's just "Deutsch" with a missing letter, but it is its own language with dialects. How can it then be that Dutch is not under Deutsch/German, even though it's clearly the same word with a missing letter?
  7. Retroactively putting the label German/deutsch on the Kaiser of Germany ("Ludwig der Deutsche") and on population groups like the Baiuvarii/Bavarians is not correct. They never saw themselves as German, or thought that an overarching Germanness stood above their people or land. Also, the Germanic peoples were never one ethnic group.
  8. How can Austrians, Swiss and South Tyroleans be Germans and speak German (Standard German) when they live in different nations, cultures and mentalities and have different genes?
  9. A German now means only and exclusively a citizen of the Federal Republic of Germany. There is no German culture or language outside of it. (Except maybe the Hutterites, who still see themselves as German, but in the sense of 1800!)
  10. Dutch was put under the German branch by a Prussian linguist in 1905; today no one would dare do this, and the hate would be immense. It is just the same thing today with Bavarian and its dialects.

As a member of a minority and a speaker (a South Tyrolean in Italy, speaking Bavarian, a language which is endangered both nationally and internationally), this is obviously a very important issue to me. ;)

@79Luca79 79Luca79 changed the title Bairisch Add Bavarian ("Bairisch") Language May 4, 2023
@github-actions

github-actions bot commented May 4, 2023

pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md

@Logophoman
Contributor

I think it's best to just have one of them active - the one with the most recent updates should be active the other one closed in my opinion.

@79Luca79
Contributor Author

79Luca79 commented May 4, 2023

Thank you @andreaskoepf for closing my obsolete Pull-request and for labeling this PR!

(From other closed and now obsolete PR)

The likelihood of merging German dialects like Swabian or Bavarian is currently IMO very low unless someone really adds special support for language dialects (e.g. to ensure they don't show up during primary language selection). The only dialect or language-variation we currently support is Brazilian Portuguese (>200M speakers).

Given the fact that we haven't yet collected a meaningful number of messages for "Hochdeutsch" I would in general question this further fragmentation.

There are many different language variations like British and American English or Swiss, Austrian, Bavarian, Saxon, Plattdeutsch, Kölsch German. Within many countries there are dialects spoken by groups in certain areas. I think it would be great to support these if the choices for these sub-languages are made in a non-obtrusive way.


Bavarian, or in my case South Tyrolean/Tyrolean, is not seen by my people here as a dialect of German. For example, if I ask Tyroleans, "What language do the Germans speak?", they say German (but they mean the standardisierte neuhochdeutsche Schriftsprache; they just don't read up on it). I then ask: what do we speak? German, they say. After I tell them that if we both speak German but the "Germans" don't understand us, something doesn't add up, they say: oh, then we speak Tyrolean. (I chose Bairisch here because my friends and I can read it just fine; Tyrolean has only ~1 million speakers, and for now this would fragment the project too much.) I think adding Bairisch is fine, since I know a few people who would like to contribute and/or chat. Some of them are not interested in chatting in "German", but they are when I tell them they can chat in their own language, as they already do on SMS and WhatsApp with all their family and friends. (Everyone does, not just young people; all grandparents here write exclusively in Pustrarisch/Vinschgerisch/Eisacktalerisch. I simply call it "Bairisch" for now here.)

Why should Bairisch not show up in the primary language selection? If it gets hidden under German, some people won't find it and will just use German anyway, or use English, because they think it's again treated as just a dialect and that they would be contributing to German.
I think separating Bairisch from German and giving it its own tab, like Catalan, is REALLY important. I see it as a first small step towards bringing Bairisch and its dialects/variations/whatever out of the usual SMS/WhatsApp space and into people's more everyday usage. I would not write all of this and make all of this effort if I didn't believe in the transforming effects that OA can have.
About Brazilian Portuguese: from my research, it seems people really call it this way and not "Brazilian". If it really is the case that Portuguese speakers cannot understand B-P well or at all, then I feel it would be a good idea to call it Brazilian. (Just a small thought; they themselves need to have the will first.) The name of a language is also a good indication of whether its speakers want to have it called American English or B-P. If they really wanted to do their own thing, they would call it "American", for example. Bairisch was never called Bavarian German and is in this case a language that wants to be separate. (But it's not always the case! See Kölsch. No "Deutsch" in the name, but according to Wikipedia its speakers supposedly didn't even know they speak their own separate language??)

Anyway, on the issue of fragmenting German: I don't think it's a big issue, because many Germans today speak solely "German" and nothing else. So there is no will anyway to add a dialect that only their grandpas speak today. It's vastly different with Bairisch.

Fragmentation: as far as I know, there has been no effort by the Aussies, Americans, Canadians, Saxons or any other dialect minority to have their dialects considered. That Swabian (a dialect of the Alemannic language) and Bavarian (a language) were the first to even translate the website is a very clear sign that these people are much more connected to their language than any other minority in Germany. If we now say that only an "official" language like Platt/Low German is eligible for its own language tab, then the question is: where are all of them, and why don't they translate the website?
The reasons, I think, are: too few speakers, too pessimistic an outlook to even start, and most speak Hochdeutsch and do not consider their Platt important enough or different enough to start. (But of course, some still speak and love it, and some will come and start translating and contributing! :) )

Also, if we look at Wikipedia, we can guess which people mean it seriously and think their languages are languages and not just dialects by simply looking at which Wikis exist: Alemannic and Bavarian, but no American English or B-P.
A Platt one exists, but it does not mention that Platt is a language.

But we even have Dr. Anthony Rowley, who says Bairisch could be seen as its own language. I don't know of any scientists who say that openly about the smaller languages/dialects.

Also, we cannot group Bairisch under German because it is spoken in Austria and Italy as well. When these people are asked, "Are you German?", they say no. Asked whether they speak German, they also mostly say no.
So this again differentiates Bairisch from most of the other dialects of German: it is spoken in nations where people identified as Deutsch 200 or 100 years ago (depending on how you define it), but today they no longer do.
Also, "German" is Hochdeutsch, but it is not called that on the Internet or in OA. As we know, Bairisch is not a dialect of Hochdeutsch (the standardisierte neuhochdeutsche Schriftsprache) but of Oberdeutsch. So there is no way that Bairisch can sit under the German tab.

On too few speakers: Bairisch has 13 million L1 speakers. OA already has Bokmål, Catalan, soon Serbian, Swedish and Danish, all languages with fewer speakers, so I don't see why we could not find enough people. If they are here, why can't Bairisch be here?

We need to stop looking at nations and assigning new languages willy-nilly for political reasons.
As we know from Wikipedia (so it is sourced): "the difference between (Standard) German and Bavarian is greater than the difference between Swedish, Norwegian and Danish, or between Serbian, Croatian and Slovenian." And I bet between Ukrainian and Russian too.

:)

For German dialects I would like to propose a compromise: Let's allow questions in an (understandable) dialect or slang in the German Dataset, but they must always be answered in Standard German. This would make the dataset more robust against all kinds of normal behavior by German users like using "nen" instead of "einen" and so on. It would also be a good policy to avoid having this discussion again and again for every dialect.

@Logophoman If they just use "nen" instead of "einen", then there is a high probability that these users won't even think about whether their speech could be its own language. I think we are a bit too worried about fragmentation of the German dataset. For that to happen, you need a really different set of words and grammar, different enough for most people to even notice that speakers of the standard language don't understand their speech or text.

I don't have much experience with Swabian; so far I have not seen videos where people speak it, so I ignorantly thought it was just a mild "dialect" of "German"! I understand "Swiss" fine and think it's clearly not German.
But! After reading your translation, I now really think that yes, it's very different from Standard German, and native speakers of Standard German will have significant problems understanding it by listening or reading. They will get it from reading, but that is also the case between Norwegian and Danish (a case that comes up in an issue or PR here, so I'm not making stuff up).

:)

(I think this wall of text clearly shows that this is a serious issue for minorities.)

EDIT: I would really like to contribute and chat with the bot again, but in my own language (or language group)! I contributed a little to English and German, but soon heard that chatting with it in Bavarian could be possible (yes, I know the conversation quality will be low in the beginning). So now I'm just waiting to start. :)

i corrected translations
@github-actions

github-actions bot commented May 6, 2023

pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md

added translations for Word-Titles. I have left the information text untouched as it must be clearly and easily understandable.
@github-actions

github-actions bot commented May 6, 2023

pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md

i changed this to make it more consistent and the "violent" issue is gone.
I translated everything. There is now nothing more to do in this Pull-Request, it is ready to be merged in my opinion.
@Logophoman
Contributor

@79Luca79 I personally agree that adding these dialects to the main language list wouldn't be a real problem. Maybe, when adding a bunch of languages, we could make the dropdown searchable? @andreaskoepf I could make the necessary changes if that is functionality we want. Alternatively, I could work on a dropdown menu for the languages with a dialect mapping component that is only rendered conditionally, when there are actual dialects that belong to a language group...

This could look as follows:

image

And I would simply add dialects in here, where Bairisch etc. could be added:

image

If this PR is wanted, I'll need to make a few changes to get it to work correctly, since I need to change the way routes are built in the LanguageSelector component, but I think it's doable (it will probably take me 2 weeks until the PR is ready)...

Which way do you think we should proceed, @andreaskoepf? Should I make a PR with a search component for the main languages (I personally think this will offset the usability issues posed by a long language list), or do you want a conditionally rendered dialect dropdown?
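
To make the two options easier to compare, here is a minimal sketch of the logic behind each. It is only an illustration: the `LanguageOption` type, the `parent` field, the sample entries and both function names are assumptions, not the actual code or data of the LanguageSelector component.

```typescript
// Hypothetical shape of one entry in the language list (not the real project type).
interface LanguageOption {
  code: string; // language code, e.g. "de" or "bar"
  name: string; // name shown in the dropdown
  parent?: string; // assumed field: code of the language group, if shown as a variety
}

// Sample data for illustration only.
const OPTIONS: LanguageOption[] = [
  { code: "de", name: "Deutsch" },
  { code: "bar", name: "Bairisch" },
  { code: "ca", name: "Català" },
  { code: "pt", name: "Português" },
  { code: "pt-BR", name: "Português (Brasil)", parent: "pt" },
];

// Option A: searchable flat list.
// Case-insensitive match on either the display name or the code.
function filterLanguages(options: LanguageOption[], query: string): LanguageOption[] {
  const q = query.trim().toLowerCase();
  if (q === "") return options; // empty search shows the full list
  return options.filter(
    (o) => o.name.toLowerCase().includes(q) || o.code.toLowerCase().includes(q)
  );
}

// Option B: conditional variety dropdown.
// Returns the entries attached to a language group; the UI would render
// the second dropdown only when this array is non-empty.
function varietiesOf(options: LanguageOption[], parentCode: string): LanguageOption[] {
  return options.filter((o) => o.parent === parentCode);
}
```

With option A every entry, including "bar", stays in the primary list and is found by typing part of its name or code. With option B an entry only appears after its group has been selected, which is the behaviour the discussion above is about.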

@79Luca79
Contributor Author

79Luca79 commented May 8, 2023

@Logophoman 2 weeks is a long time to wait.
I think it would be better to simply add Bairisch to the list we have now. It is only 1 more entry.
- Which dialects will come in the near future? Serbian is waiting to be merged, and it will certainly not be called Serbo-Croatian, even if it is a dialect continuum. (The war.)
- The Spanish will make efforts to get Catalan put under Spanish; the Italians will want Ladin, Sardinian, Sicilian and more under their language. (Italian is also the standardized Tuscan dialect made popular by Dante, so it is the same "mess" as in German.)

If we do it this way, where do we stop with the categories? A Scandinavian, Serbo-Croatian, Italian, Spanish, Russian dialect entry?
We should first add the language separately; it's simple, and if enough people make new translations for new languages/dialects, then I feel "we can discuss again".

If we take a look at the languages Wikipedia supports (as a guide to judge whether people are so involved in their dialect that they consider making their own Wikipedia version), we see that most European languages are just the standard.
Under the different languages like Czech, Slovak, English, French and Italian, we can find short links to dialects (not even all have links!), or they are even just called varieties.

If these people really spoke as differently as the Bavarians, Platt speakers and Swabians do, they would already have their own translated Wikipedia versions.

I think we should wait with deploying the categorization.

And I am honestly restless; I just want to help with labelling and making OA better.
(No worries, reviewers, I don't mean you are slow! You do it voluntarily. :) )

Maybe I'm more sensitive to the categorization than most people (I'm from a minority), or maybe I'm overthinking it.
(You asked @andreaskoepf, but I just wanted to give another angle.)

and changed some words=minor diff
bar   added. Bairisch.
Collaborator

@andreaskoepf andreaskoepf left a comment

We can try whether "bar" works on dev ... (theoretically it should)

@andreaskoepf andreaskoepf merged commit 06292d7 into LAION-AI:main May 11, 2023
3 participants