Add new "dasl_tapered" volume control profile #1699

Merged: 4 commits, Jul 17, 2023

Conversation


@dasl- dasl- commented Jul 2, 2023

Although I'm not an expert, to my ears, this logarithmic mode of volume control does a better job of making the volume percentage match the perceived volume level. That is, with logarithmic mode, 50% volume sounds half as loud as 100% volume. And 25% volume sounds a quarter as loud as 100% volume, etc.

Dealing with airplay protocol, we have a volume between [-30, 0]. So we want -15 to sound half as loud as 0. And we want -22.5 to sound a quarter as loud as 0. Etc.

I found this page helpful for my understanding: http://www.sengpielaudio.com/calculator-levelchange.htm

FWIW I use the builtin raspberry pi soundcard. I'm not sure if perceptual loudness maps differently when using a different soundcard.

Also, no worries if you don't want to merge additional volume_control_profiles. I could maintain a fork if you're not into the idea.

@mikebrady
Owner

Thanks for the interesting ideas here and for the interesting reference.

Did you get a chance to look at https://tangentsoft.com/audio/atten.html, which is where the ideas behind the standard transfer function came from originally?

Also, forgive me -- maybe I'm missing something -- but doesn't the "flat" volume control profile not do what you are proposing here, i.e. spread the logarithmic attenuation range evenly over the -30.00 to 0 range of the AirPlay volume?

@dasl-
Author

dasl- commented Jul 8, 2023

> Did you get a chance to look at https://tangentsoft.com/audio/atten.html, which is where the ideas behind the standard transfer function came from originally?

Thanks for the reminder, I had not looked at that yet. The text in the "Attenuation Curve" section was particularly interesting. Quoting from the article below (also - apologies for the length of my 😅 reply)

Response to article

> Our ears respond to sound on an exponential scale. A sound has to be 10 times as powerful for us to hear it as twice as loud. For a 4× increase in volume, it must be 100 times as powerful. This kind of increase gives a curve like this: ...

This seems to state that if we want volume percentage to map to perceived loudness percentage, we should use logarithmic attenuation. This is why I implemented logarithmic attenuation. The article goes on to explain some practical reasons why attenuators don't actually have true log curves:

> Oddly enough, though, commercial volume controls don’t have true log curves. (I’ve yet to see one that does, at any rate.)
>
> The first major reason for this is that it’s harder to manufacture a pot with a varying taper than one with a linear taper. So, many of the less expensive “log” pots are made from two linear segments joined together to approximate a log curve.

So it seems that physical limitations are why they don't typically make attenuators with true log curves. This physical limitation doesn't apply to software obviously. Software could attenuate more "ideally" (i.e. mapping volume % to perceived loudness %). From what I can tell, your default attenuator function uses three linear segments joined together to approximate a log curve: https://github.com/mikebrady/shairport-sync/blob/master/documents/Shairport%20Volume%20Control%20Transfer%20Function.pdf

> The other major facet of this issue has to do with the way volume controls are typically used. Audio systems are designed so that the volume control is set near the top of their range most of the time in common use. Since you want fine control in the range you use the volume control most often, the control is designed to have fine attenuation rates at the top end of the range. But, because our ears are so sensitive to soft sounds, you wouldn’t want fine control across the entire range. People want to be able to turn the volume all the way down and have the sound practically muted, even though an attenuator is not a mute control. The solution is to increase the attenuation rate towards the end of the control range, so that the control’s total attenuation is very high.

Personally, I don't want finer control at the top of volume range. I'd prefer that a 5% reduction in volume level (e.g. 100% -> 95%, or 50% -> 45%, etc) should always result in a 5% reduction in perceived loudness, no matter if we're at the top, middle, or low end of the volume range. Same applies for any given percentage change - I used 5% as an arbitrary example.

> People want to be able to turn the volume all the way down and have the sound practically muted,

I agree that I want the sound muted when the volume is all the way down, but I don't have any problem with the logarithmic volume control that I implemented - it sounds muted to me when the volume is at 0%.

Based on my understanding of the article, it seems like your implementation might be truer to how a physical volume attenuator works. And perhaps my approach aims for something different: that a given percentage reduction in volume should result in the same percentage reduction in perceived loudness.

One last point I'll add is that to my ears, my logarithmic attenuation sounds similar to whatever attenuation function apple is using everywhere e.g. Music app, system volume settings, etc.

Response to other questions

You also mentioned:

> Also, forgive me -- maybe I'm missing something -- but doesn't the "flat" volume control profile not do what you are proposing here, i.e. spread the logarithmic attenuation range evenly over the -30.00 to 0 range of the AirPlay volume?

I don't think so. Here is a graph of my logarithmic control (red) and your flat control (blue). The x-axis is the airplay volume (range of [-30, 0]). The y-axis is the resulting decibel level that these attenuation algorithms would set. This graph rendering assumes a min_db value of -10238 and a max_db value of 400 - this is the default min and max on a stock raspberry pi soundcard.

Here is a PNG of the same graph, in case the graph website doesn't load:
[Image: desmos-graph]

Other thoughts

Again, no worries if you don't want to merge this. I recently realized that I could probably accomplish the same thing by setting ignore_volume_control = "yes" and using run_this_when_volume_is_set = /path/to/my/logarithmic/control/script .
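A rough sketch of what such a script might look like, in Python. Everything here is hypothetical: the function name, the curve (1000 units of dB * 100 lower per halving of the volume fraction, inferred from the behaviour described in this thread), and the assumption that the AirPlay volume arrives as the last command-line argument. It only prints the computed value; actually applying it to a mixer is left out.

```python
import math
import sys

MAX_DB = 400      # dB * 100; stock Raspberry Pi value quoted above
MIN_DB = -10238   # dB * 100; stock Raspberry Pi value quoted above


def airplay_to_attenuation(vol: float) -> float:
    """Map an AirPlay volume to an attenuation in dB * 100 (assumed curve)."""
    if vol <= -30.0:
        # Bottom of the range and anything below it (e.g. a mute value)
        # is treated as the minimum in this sketch.
        return float(MIN_DB)
    pct = (min(vol, 0.0) + 30.0) / 30.0  # volume fraction in (0, 1]
    return max(float(MIN_DB), MAX_DB + 1000.0 * math.log2(pct))


def main(argv):
    # Assumption: the volume is passed as the last argument.
    try:
        vol = float(argv[-1])
    except (IndexError, ValueError):
        print("usage: script <airplay_volume>", file=sys.stderr)
        return 2
    print(f"{airplay_to_attenuation(vol):.0f}")
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

The script would then be referenced from `run_this_when_volume_is_set`, with `ignore_volume_control` set to "yes" as described above.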

@mikebrady
Owner

Many thanks for the reply and for all the work you've done on this. Let me digest it for a few days, please.

@dasl-
Author

dasl- commented Jul 9, 2023

sure, no rush at all.

@mikebrady
Owner

mikebrady commented Jul 10, 2023

Thanks again for all this.

What has been confusing me is the use of the term "logarithmic". The volume-to-attenuation functions, called *_vol2attn, transfer the -30 to 0 AirPlay volume setting to an attenuation denominated in decibels times 100 (dB * 100). Decibels are, of course, a version of the logarithm of the output level.
That means that the "flat" transfer function flat_vol2attn is truly logarithmic -- a change of 1.0 in AirPlay volume always results in a change in attenuation of the same number of decibels wherever the AirPlay volume is in its range. E.g. a change in AirPlay volume from -30.0 to -29.0 will change the attenuation by the same number of decibels as a change in AirPlay volume from -10.0 to -9.0.
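To make that property concrete, here is a minimal Python sketch (the project itself is C). `flat_sketch` and its signature are hypothetical stand-ins, not the project's `flat_vol2attn`, and the sketch assumes the flat profile is simply a straight line from the minimum to the maximum attenuation. The mixer limits are the Raspberry Pi values quoted earlier in this thread.

```python
def flat_sketch(vol: float, max_db: float, min_db: float) -> float:
    """Straight line from min_db at AirPlay -30 to max_db at AirPlay 0 (dB * 100)."""
    if not -30.0 <= vol <= 0.0:
        raise ValueError("AirPlay volume must be in [-30, 0]")
    fraction = (vol + 30.0) / 30.0  # 0.0 at the bottom, 1.0 at the top
    return min_db + (max_db - min_db) * fraction


MIN_DB, MAX_DB = -10238, 400

# A one-unit change in AirPlay volume near the bottom and near the middle.
low_step = flat_sketch(-29.0, MAX_DB, MIN_DB) - flat_sketch(-30.0, MAX_DB, MIN_DB)
high_step = flat_sketch(-9.0, MAX_DB, MIN_DB) - flat_sketch(-10.0, MAX_DB, MIN_DB)
print(round(low_step, 3), round(high_step, 3))
```

Both steps come out the same size in dB * 100, which is the sense in which the flat profile is "truly logarithmic" here.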

However, as the tangentsoft author points out, "commercial volume controls don’t have true log curves" and advances two reasons for this. The first is the difficulty of actually building a log response into a real potentiometer.

The second reason, though, "has to do with the way volume controls are typically used. Audio systems are designed so that the volume control is set near the top of their range most of the time in common use. Since you want fine control in the range you use the volume control most often, the control is designed to have fine attenuation rates at the top end of the range. But, because our ears are so sensitive to soft sounds, you wouldn’t want fine control across the entire range. People want to be able to turn the volume all the way down and have the sound practically muted, even though an attenuator is not a mute control. The solution is to increase the attenuation rate towards the end of the control range, so that the control’s total attenuation is very high."

That is, the attenuation rate at the bottom of the volume range is greater than at the top of the range. In other words, the tangentsoft writer is making the case that real attenuators are not actually logarithmic for a good reason. That's what the "standard" vol2attn() function tries to model -- a faster change in attenuation for a given change in the volume at the lower end of the volume control range. So, it's [deliberately] not a logarithmic transfer function.

It seems to me that your new transfer function, logarithmic_vol2attn(), is similarly not a logarithmic transfer function. Instead, it does what the tangentsoft author had in mind when they wrote that real attenuators "increase the attenuation rate towards the end of the control range, so that the control’s total attenuation is very high". Your transfer function smoothly increases the attenuation rate at the lower end of the AirPlay volume range.

Whaddya think?

(If I'm right, then the issue is simply one of terminology: the term logarithmic_vol2attn is confusing. If it was called tapered_vol2attn, or even dasl_vol2attn (!), my reservations would be gone.)

@dasl-
Author

dasl- commented Jul 12, 2023

Hmm, perhaps I was misinterpreting what the tangentsoft author meant. I agree that my attenuation function, like your standard attenuation function, increases the rate of decibel attenuation at the lower end of the x-axis. When I said that I don't want the rate of attenuation to fluctuate with the x-axis, I intended to say that I don't want the rate of perceived loudness to fluctuate with the x-axis (i.e. 5% change in airplay volume should always result in 5% change in perceived loudness). But of course perceived loudness and decibels are two different scales.

I think we are both thinking about things similarly, but perhaps using different terminology. I think it can be hard because the terms depend on precisely what we are referring to. My attenuation function is clearly a logarithmic curve - it is essentially a variant of y = log(x) - thus I called it "logarithmic". But I think you are saying that decibels are already logarithmic, so it makes sense to call the flat attenuation logarithmic (I would have called it linear :).

I don't have a strong feeling about whose naming conventions are better, so I'm happy to rename mine to either of your suggestions.

Finally, I'd like to add some more notes on decibels and perceived loudness. I found this article: https://salfordacoustics.co.uk/sound-waves/waves-transverse-introduction/decibel-scale

It says:

> When a sound is perceived to double in loudness, this corresponds to roughly an increase in 10 dB.

Many other sources appear to agree that humans perceive a sound to double in loudness after a 10 dB increase. I believe my attenuation function preserves this property. Here are my findings in spreadsheet form.

To go into more depth about what the spreadsheet shows:

  1. When airplay volume is 0, this corresponds to 100% volume slider position. On my graph, this corresponds to a y-axis value of 400 millibels (4 decibels).

  2. When airplay volume is -15, this corresponds to 50% volume slider position. We want this to be half as loud, thus 10 decibels lower. When airplay volume is -15, my attenuation function will output -600 millibels (-6 decibels). Note that this is a 10 dB difference from 100% volume, as desired.

  3. When airplay volume is -22.5, this corresponds to a 25% volume slider position. We want this to be half as loud as what we got in (2), thus another 10 dB lower. When airplay volume is -22.5, my attenuation function will output -1600 millibels (-16 dB). Note that this is again a 10 dB difference from 50% volume, as desired.

  4. When airplay volume is -26.25, this corresponds to a 12.5% volume slider position. We want this to be half as loud as what we got in (3), thus another 10 dB lower. When airplay volume is -26.25, my attenuation function will output -2600 millibels (-26 dB). Note that this is again a 10 dB difference from 25% volume, as desired.
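The four data points above can be reproduced with a short Python sketch. This is not the code in the PR: `tapered_sketch` is a hypothetical name, and the formula (maximum level plus 1000 millibels times the base-2 log of the slider fraction) is inferred from the numbers in the list, assuming the 400 millibel maximum of the Raspberry Pi soundcard.

```python
import math


def tapered_sketch(vol: float, max_db: float = 400.0) -> float:
    """Assumed taper: 1000 millibels (10 dB) lower for each halving of the slider."""
    pct = (vol + 30.0) / 30.0  # slider position as a fraction, 0.0 to 1.0
    if pct <= 0.0:
        # log2(0) is undefined; a real mixer would clamp to its minimum here.
        return float("-inf")
    return max_db + 1000.0 * math.log2(pct)


for vol in (0.0, -15.0, -22.5, -26.25):
    pct = (vol + 30.0) / 30.0
    print(f"airplay {vol:7.2f}  slider {pct:6.1%}  output {tapered_sketch(vol):8.1f} millibels")
```

Each halving of the slider position lowers the output by 1000 millibels, matching the 400, -600, -1600, -2600 sequence in the list.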

I'll think a bit more about this, but will likely push a change to rename the function in the coming days! Let me know if you have any more suggestions.

@mikebrady
Owner

Thanks for that, and for your explanation of the rationale behind the relationship between the AirPlay volume and the attenuation, which looks plausible and certainly worth trying out. I rather like dasl_taper_vol2attn myself... 🙂

@dasl-
Author

dasl- commented Jul 14, 2023

ok, rename is complete! lmk if you think of any other feedback.

@mikebrady mikebrady changed the title from "Add logarithmic volume_control_profile mode" to "Add new "dasl_tapered" volume control profile" on Jul 17, 2023
@mikebrady mikebrady merged commit 57061df into mikebrady:development Jul 17, 2023
9 checks passed
@mikebrady
Owner

mikebrady commented Jul 17, 2023

Many thanks for all your work on this. And dang, I managed to misspell your name in the commit -- apologies.

Anyway, I have two more suggestions/observations.

  1. Imagine a DAC with an attenuator range of 30 dB -- say from +4 to -26 for convenience here. (They do exist -- many of the low-cost USB DACs are like this.) With your new "dasl" transfer curve, the lower part of the AirPlay range, from -26.25 down to -30, would result in no change in the output attenuation, as it would be clamped at -26. So how about a slight modification: pick the least attenuation (i.e. closest to 0.00) of either your "dasl" curve or the flat "curve". I notice that the transition from the dasl curve to the flat curve occurs at -24.516 on the AirPlay volume. Whaddya think?
  2. The maximum attenuation provided by the software attenuator built in to SPS is 1/65536 or -96.3295986125 dB. Would you consider making that 1000 number 963.3? (Not sure if it makes any sense, TBH -- maybe this is just my tidy mind.) [Update] Forget it -- this really was my tidy mind; the 1000 is the 10dB drop you want when you halve the interval...
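For reference, the -96.33 dB figure in item 2 follows from the usual amplitude-ratio formula, 20 * log10(ratio). A quick Python check (variable names are illustrative only):

```python
import math

smallest_gain = 1 / 65536  # smallest non-zero gain mentioned above
max_attenuation_db = 20 * math.log10(smallest_gain)

print(round(max_attenuation_db, 4))        # attenuation in dB
print(round(max_attenuation_db * 100, 1))  # the same value in dB * 100
```

In dB * 100 that is roughly -9633, so 963.3 is a tenth of that range.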

@dasl-
Author

dasl- commented Jul 18, 2023

Hah, no worries :)

  1. Interesting idea. A friend of mine has this USB sound card for his raspberry pi. It has a volume range from 0 to 30, similar to what you mentioned. While I was experimenting with his sound card, I realized that it seems to adjust volume differently than the built in raspberry pi sound card. That is, for the USB sound card, a value of 15 sounds half as loud as 30. And 7.5 sounds half as loud as 15 (I would call this "linear" adjustment, but we might disagree on terminology :). This is of course in contrast to the built in sound card, where I believe that a decrease of 10 decibels results in a halving of loudness.

So in my sample size of one sound card, it would be more suited to use the "flat" volume control profile. I'm curious if you're aware of any specific USB sound card models that both have a very limited volume adjustment range and would be a good fit for the dasl_tapered volume control profile (i.e. is your imagined DAC purely hypothetical)? I question whether any of them are actually using "decibels" as the unit of measurement if they only have a range of 30 - that doesn't seem like a large enough range to be very expressive in the decibel scale. So I wonder if they are all intended to be used with a "flat" profile.

Btw, I graphed the dasl_tapered curve against what the flat curve would look like with this hypothetical DAC that has a range of -26 to 4 dB: https://www.desmos.com/calculator/mqtitkizvn

  2. Yes, I believe the 1000 number is crucial to maintaining the property that a doubling in airplay volume percentage will result in a 10 dB change. If we used 963.3, then a doubling in airplay volume percentage would result in a 9.633 dB change.

I got nerd sniped and actually wrote a proof that my volume attenuation function maintains the property that for a doubling in volume percentage, decibels will increase by 10. If you're interested... https://drive.google.com/file/d/12BD3hYW8ctQKSK1uiaPddeFcgrbePU4f/view?usp=drive_link
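A compact way to see the property (a sketch only, not the linked proof): assume the curve has the form below, where $p$ is the volume fraction, $v$ the AirPlay volume, $a_{\max}$ the maximum level and $k$ the scale factor, all in dB * 100. These symbols are introduced here for illustration.

```latex
\begin{aligned}
a(p) &= a_{\max} + k \log_2 p, \qquad p = \frac{v + 30}{30}, \quad v \in (-30, 0] \\
a(2p) - a(p) &= k \log_2(2p) - k \log_2 p = k \log_2 2 = k
\end{aligned}
```

So with $k = 1000$ a doubling of the volume fraction always adds 1000 (10 dB), and with $k = 963.3$ it would add 963.3 (9.633 dB), wherever in the range the doubling happens.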

@mikebrady
Owner

mikebrady commented Jul 18, 2023

Thanks for all that.

Some devices with CMedia chips, for example this one, have a 30 dB range. They provide the range of attenuations in decibels (typically 0 to -30).

It seems to me that we have to assume that the dB values, where provided, are real.

(SPS will only use a device's mixer if it permits reading and writing attenuation in dB. If the mixer controls are not available in dB, then we can't really say anything reliable about them, and SPS will not use them.)

Here is a graph of the situation with the 30 dB attenuator.

You can see the problem on the y = -3000 milliBels line at the bottom left: from AirPlay volume -30 to -26.25, the attenuation will be stuck at -3000.

So, would you be okay with me making the modification I proposed? In that case, the attenuation would rise on the flat curve from -30 to -24.516 and would then rise on the true dasl-tapered curve from there up to 0. It seems to me you'd have the best outcome for a situation like this.

If a mixer provides an attenuation range even from about 50dB up, the flat curve is probably never going to be used. But if the attenuation range is down around 30 dB, the modified dasl taper would, IMHO, be better behaved.
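A Python sketch of the proposed modification, for the 30 dB case. It is not the code that was merged: the function names are hypothetical, the flat curve is assumed to be a straight line across the mixer range, and the taper is assumed to drop 1000 (dB * 100) per halving of the volume fraction, clamped at the mixer minimum.

```python
import math


def flat_sketch(vol, max_db, min_db):
    """Straight line from min_db at AirPlay -30 to max_db at AirPlay 0."""
    return min_db + (max_db - min_db) * (vol + 30.0) / 30.0


def tapered_sketch(vol, max_db, min_db):
    """Assumed taper: 1000 (dB * 100) lower per halving, clamped at min_db."""
    pct = (vol + 30.0) / 30.0
    if pct <= 0.0:
        return float(min_db)
    return max(float(min_db), max_db + 1000.0 * math.log2(pct))


def modified_sketch(vol, max_db, min_db):
    """Use whichever curve attenuates less, i.e. the higher value."""
    return max(tapered_sketch(vol, max_db, min_db), flat_sketch(vol, max_db, min_db))


def crossover(max_db, min_db, lo=-26.0, hi=-20.0):
    """Bisect for the AirPlay volume where the two curves meet.

    The default bracket is chosen for a 30 dB mixer range.
    """
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if flat_sketch(mid, max_db, min_db) > tapered_sketch(mid, max_db, min_db):
            lo = mid  # flat is still the higher curve: the crossing is further up
        else:
            hi = mid
    return (lo + hi) / 2.0


MAX_DB, MIN_DB = 0, -3000  # the 30 dB attenuator discussed above

for vol in (-30.0, -28.0, -26.25, -15.0, 0.0):
    print(vol,
          round(tapered_sketch(vol, MAX_DB, MIN_DB)),
          round(modified_sketch(vol, MAX_DB, MIN_DB)))
print("crossover near", round(crossover(MAX_DB, MIN_DB), 3))
```

In this sketch the plain taper stays pinned at -3000 for the lowest AirPlay volumes, while the modified version keeps moving along the flat line until the two curves cross.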

@dasl-
Author

dasl- commented Jul 18, 2023

> So, would you be okay with me making the modification I proposed?

Yes, I'm fine with that. Like you said, it would not even noticeably affect my sound cards because they have such a large attenuation range already.

I'm just not sure if any of these sound cards with small attenuation ranges are intended to be used with a "tapered" volume attenuation function. I wouldn't be surprised if a "flat" attenuation function makes their volume adjustment map better to "perceptual loudness".

> It seems to me that we have to assume that the dB values, where provided, are real.

That's a good point - maybe we should trust them. I'm just not personally convinced yet :)

@mikebrady
Owner

Okay. I’ll set about it in the next couple of days. It can’t hurt…

@mikebrady
Owner

I've just made that update. Thanks again for all your work on this -- it's pretty neat.

@dasl-
Author

dasl- commented Jul 20, 2023

nice!
