
Implement read-only local cache #2048

Merged · 11 commits · Jan 26, 2024
1 change: 1 addition & 0 deletions docs/Configuration.md
@@ -100,6 +100,7 @@ configuration variables
* `SCCACHE_DIR` local on disk artifact cache directory
* `SCCACHE_CACHE_SIZE` maximum size of the local on disk cache i.e. `2G` - default is 10G
* `SCCACHE_DIRECT` enable/disable preprocessor caching (see [the local doc](Local.md))
* `SCCACHE_LOCAL_RW_MODE` the mode that the cache will operate in (`READ_ONLY` or `READ_WRITE`)

#### s3 compatible

8 changes: 8 additions & 0 deletions docs/Local.md
@@ -43,3 +43,11 @@ Configuration options and their default values:
See where to write the config in [the configuration doc](Configuration.md).

*Note that preprocessor caching is currently only implemented for GCC and Clang and when using local storage.*

## Read-only cache mode

By default, the local cache operates in read/write mode. The `SCCACHE_LOCAL_RW_MODE` environment variable can be set to `READ_ONLY` (or `READ_WRITE`) to modify this behavior.

You can use read-only mode to prevent sccache from writing new cache items to the disk. This can be useful, for example, if you want to use items that have already been cached, but not add new ones to the cache.

Note that this feature is only effective if you already have items in your cache. Using this option on an empty cache causes sccache to do nothing but add overhead.
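A typical workflow is to populate the cache once and then switch to read-only for later builds. A minimal sketch (the cache directory path is an example, not a requirement):

```shell
# Example: warm the cache with one build, then freeze it.
export SCCACHE_DIR="$HOME/.cache/sccache"

# First build runs in the default READ_WRITE mode and populates the cache.
unset SCCACHE_LOCAL_RW_MODE

# Later builds reuse existing entries but never write new ones.
export SCCACHE_LOCAL_RW_MODE=READ_ONLY
```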
15 changes: 6 additions & 9 deletions src/cache/cache.rs
@@ -16,7 +16,7 @@
use crate::cache::azure::AzureBlobCache;
use crate::cache::disk::DiskCache;
#[cfg(feature = "gcs")]
use crate::cache::gcs::{GCSCache, RWMode};
use crate::cache::gcs::GCSCache;
#[cfg(feature = "gha")]
use crate::cache::gha::GHACache;
#[cfg(feature = "memcached")]
@@ -112,8 +112,8 @@ impl fmt::Debug for Cache {
}
}

/// CacheMode is used to repreent which mode we are using.
#[derive(Debug)]
/// CacheMode is used to represent which mode we are using.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum CacheMode {
/// Only read cache from storage.
ReadOnly,
@@ -575,17 +575,12 @@ pub fn storage_from_config(
}) => {
debug!("Init gcs cache with bucket {bucket}, key_prefix {key_prefix}");

let gcs_read_write_mode = match rw_mode {
config::GCSCacheRWMode::ReadOnly => RWMode::ReadOnly,
config::GCSCacheRWMode::ReadWrite => RWMode::ReadWrite,
};

let storage = GCSCache::build(
bucket,
key_prefix,
cred_path.as_deref(),
service_account.as_deref(),
gcs_read_write_mode,
(*rw_mode).into(),
credential_url.as_deref(),
)
.map_err(|err| anyhow!("create gcs cache failed: {err:?}"))?;
@@ -660,12 +655,14 @@ pub fn storage_from_config(

let (dir, size) = (&config.fallback_cache.dir, config.fallback_cache.size);
let preprocessor_cache_mode_config = config.fallback_cache.preprocessor_cache_mode;
let rw_mode = config.fallback_cache.rw_mode.into();
debug!("Init disk cache with dir {:?}, size {}", dir, size);
Ok(Arc::new(DiskCache::new(
dir,
size,
pool,
preprocessor_cache_mode_config,
rw_mode,
)))
}

18 changes: 17 additions & 1 deletion src/cache/disk.rs
@@ -12,7 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.

use crate::cache::{Cache, CacheRead, CacheWrite, Storage};
use crate::cache::{Cache, CacheMode, CacheRead, CacheWrite, Storage};
use crate::compiler::PreprocessorCacheEntry;
use crate::lru_disk_cache::LruDiskCache;
use crate::lru_disk_cache::{Error as LruError, ReadSeek};
@@ -65,6 +65,7 @@ pub struct DiskCache {
pool: tokio::runtime::Handle,
preprocessor_cache_mode_config: PreprocessorCacheModeConfig,
preprocessor_cache: Arc<Mutex<LazyDiskCache>>,
rw_mode: CacheMode,
}

impl DiskCache {
@@ -74,6 +75,7 @@ impl DiskCache {
max_size: u64,
pool: &tokio::runtime::Handle,
preprocessor_cache_mode_config: PreprocessorCacheModeConfig,
rw_mode: CacheMode,
) -> DiskCache {
DiskCache {
lru: Arc::new(Mutex::new(LazyDiskCache::Uninit {
@@ -88,6 +90,7 @@
.into_os_string(),
max_size,
})),
rw_mode,
}
}
}
@@ -130,6 +133,11 @@ impl Storage for DiskCache {
// We should probably do this on a background thread if we're going to buffer
// everything in memory...
trace!("DiskCache::finish_put({})", key);

if self.rw_mode == CacheMode::ReadOnly {
return Err(anyhow!("Cannot write to a read-only cache"));
Collaborator: could you please add a test to trigger this error? thanks

Contributor Author: @sylvestre Done.

Collaborator: could you please check for the error message string too? thanks
}

let lru = self.lru.clone();
let key = make_key_path(key);

@@ -143,6 +151,10 @@
.await?
}

async fn check(&self) -> Result<CacheMode> {
Ok(self.rw_mode)
}

fn location(&self) -> String {
format!("Local disk: {:?}", self.lru.lock().unwrap().path())
}
@@ -171,6 +183,10 @@
key: &str,
preprocessor_cache_entry: PreprocessorCacheEntry,
) -> Result<()> {
if self.rw_mode == CacheMode::ReadOnly {
return Err(anyhow!("Cannot write to a read-only cache"));
}

let key = normalize_key(key);
let mut buf = vec![];
preprocessor_cache_entry.serialize_to(&mut buf)?;
23 changes: 8 additions & 15 deletions src/cache/gcs.rs
@@ -13,6 +13,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.

use crate::cache::CacheMode;
use crate::errors::*;
use opendal::Operator;
use opendal::{layers::LoggingLayer, services::Gcs};
@@ -21,18 +22,10 @@ use reqwest::Client;
use serde::Deserialize;
use url::Url;

#[derive(Copy, Clone)]
pub enum RWMode {
ReadOnly,
ReadWrite,
}

impl RWMode {
fn to_scope(self) -> &'static str {
match self {
RWMode::ReadOnly => "https://www.googleapis.com/auth/devstorage.read_only",
RWMode::ReadWrite => "https://www.googleapis.com/auth/devstorage.read_write",
}
fn rw_to_scope(mode: CacheMode) -> &'static str {
match mode {
CacheMode::ReadOnly => "https://www.googleapis.com/auth/devstorage.read_only",
CacheMode::ReadWrite => "https://www.googleapis.com/auth/devstorage.read_write",
}
}

@@ -46,13 +39,13 @@ impl GCSCache {
key_prefix: &str,
cred_path: Option<&str>,
service_account: Option<&str>,
rw_mode: RWMode,
rw_mode: CacheMode,
credential_url: Option<&str>,
) -> Result<Operator> {
let mut builder = Gcs::default();
builder.bucket(bucket);
builder.root(key_prefix);
builder.scope(rw_mode.to_scope());
builder.scope(rw_to_scope(rw_mode));

if let Some(service_account) = service_account {
builder.service_account(service_account);
@@ -67,7 +60,7 @@
.map_err(|err| anyhow!("gcs credential url is invalid: {err:?}"))?;

builder.customed_token_loader(Box::new(TaskClusterTokenLoader {
scope: rw_mode.to_scope().to_string(),
scope: rw_to_scope(rw_mode).to_string(),
url: cred_url.to_string(),
}));
}
71 changes: 42 additions & 29 deletions src/compiler/c.rs
@@ -340,42 +340,51 @@ where
let mut updated = false;
let hit = preprocessor_cache_entry
.lookup_result_digest(preprocessor_cache_mode_config, &mut updated);

let mut update_failed = false;
if updated {
// Time macros have been found, we need to update
// the preprocessor cache entry. See [`PreprocessorCacheEntry::result_matches`].
debug!(
"Preprocessor cache updated because of time macros: {preprocessor_key}"
);
storage.put_preprocessor_cache_entry(

if let Err(e) = storage.put_preprocessor_cache_entry(
preprocessor_key,
preprocessor_cache_entry,
)?;
) {
debug!("Failed to update preprocessor cache: {}", e);
update_failed = true;
}
Contributor Author (comment on lines -349 to +358):

I'm unsure about this behavior. It's based on how compiler.rs handles errors.

Realistically, errors shouldn't be used here, but it seems to be how it's handled with GCS read-only, so I did it the same way. I thought about changing it, but I'd like to hear input from someone more knowledgeable about sccache's internals first. Storage::check() seems to be a great candidate, although it only seems to be used for a print and is very expensive for GCS (which could be changed, of course).

}
if let Some(key) = hit {
debug!("Preprocessor cache hit: {preprocessor_key}");
// A compiler binary may be a symlink to another and
// so has the same digest, but that means
// the toolchain will not contain the correct path
// to invoke the compiler! Add the compiler
// executable path to try and prevent this
let weak_toolchain_key =
format!("{}-{}", executable.to_string_lossy(), executable_digest);
return Ok(HashResult {
key,
compilation: Box::new(CCompilation {
parsed_args: parsed_args.to_owned(),
#[cfg(feature = "dist-client")]
// TODO or is it never relevant since dist?
preprocessed_input: vec![],
executable: executable.to_owned(),
compiler: compiler.to_owned(),
cwd: cwd.to_owned(),
env_vars: env_vars.to_owned(),
}),
weak_toolchain_key,
});
} else {
debug!("Preprocessor cache miss: {preprocessor_key}");

if !update_failed {
if let Some(key) = hit {
debug!("Preprocessor cache hit: {preprocessor_key}");
// A compiler binary may be a symlink to another and
// so has the same digest, but that means
// the toolchain will not contain the correct path
// to invoke the compiler! Add the compiler
// executable path to try and prevent this
let weak_toolchain_key =
format!("{}-{}", executable.to_string_lossy(), executable_digest);
return Ok(HashResult {
key,
compilation: Box::new(CCompilation {
parsed_args: parsed_args.to_owned(),
#[cfg(feature = "dist-client")]
// TODO or is it never relevant since dist?
preprocessed_input: vec![],
executable: executable.to_owned(),
compiler: compiler.to_owned(),
cwd: cwd.to_owned(),
env_vars: env_vars.to_owned(),
}),
weak_toolchain_key,
});
} else {
debug!("Preprocessor cache miss: {preprocessor_key}");
}
}
}
}
@@ -491,8 +500,12 @@ where
.collect();
files.sort_unstable_by(|a, b| a.1.cmp(&b.1));
preprocessor_cache_entry.add_result(start_of_compilation, &key, files);
storage
.put_preprocessor_cache_entry(&preprocessor_key, preprocessor_cache_entry)?;

if let Err(e) = storage
.put_preprocessor_cache_entry(&preprocessor_key, preprocessor_cache_entry)
{
debug!("Failed to update preprocessor cache: {}", e);
}
Contributor Author (comment on lines -494 to +508): Same as above.

}
}

7 changes: 6 additions & 1 deletion src/compiler/compiler.rs
@@ -1381,7 +1381,7 @@ where
mod test {
use super::*;
use crate::cache::disk::DiskCache;
use crate::cache::{CacheRead, PreprocessorCacheModeConfig};
use crate::cache::{CacheMode, CacheRead, PreprocessorCacheModeConfig};
use crate::mock_command::*;
use crate::test::mock_storage::MockStorage;
use crate::test::utils::*;
@@ -1824,6 +1824,7 @@ LLVM version: 6.0",
use_preprocessor_cache_mode: preprocessor_cache_mode,
..Default::default()
},
CacheMode::ReadWrite,
);
// Write a dummy input file so the preprocessor cache mode can work
std::fs::write(f.tempdir.path().join("foo.c"), "whatever").unwrap();
@@ -1949,6 +1950,7 @@ LLVM version: 6.0",
use_preprocessor_cache_mode: preprocessor_cache_mode,
..Default::default()
},
CacheMode::ReadWrite,
);
// Write a dummy input file so the preprocessor cache mode can work
std::fs::write(f.tempdir.path().join("foo.c"), "whatever").unwrap();
@@ -2236,6 +2238,7 @@ LLVM version: 6.0",
use_preprocessor_cache_mode: preprocessor_cache_mode,
..Default::default()
},
CacheMode::ReadWrite,
);
let storage = Arc::new(storage);
// Write a dummy input file so the preprocessor cache mode can work
@@ -2361,6 +2364,7 @@ LLVM version: 6.0",
use_preprocessor_cache_mode: preprocessor_cache_mode,
..Default::default()
},
CacheMode::ReadWrite,
);
let storage = Arc::new(storage);
// Pretend to be GCC. Also inject a fake object file that the subsequent
@@ -2454,6 +2458,7 @@ LLVM version: 6.0",
use_preprocessor_cache_mode: preprocessor_cache_mode,
..Default::default()
},
CacheMode::ReadWrite,
);
let storage = Arc::new(storage);
// Pretend to be GCC.
Expand Down
Loading