AIP-58: Add Airflow ObjectStore (AFS) #34729

Merged
merged 83 commits into from
Oct 27, 2023
0c456ee
IO
bolkedebruin Sep 28, 2023
d9c92f0
Further work
bolkedebruin Sep 28, 2023
bda96de
Add Airflow FS
bolkedebruin Oct 3, 2023
c064314
Add fsspec dependencies
bolkedebruin Oct 8, 2023
29bf6b8
Move stuff to provider packages
bolkedebruin Oct 4, 2023
ad22ed9
Add fsspec
bolkedebruin Oct 4, 2023
58753fa
Use provider style plugins
bolkedebruin Oct 4, 2023
0c4700c
Add plugin registration
bolkedebruin Oct 4, 2023
f86c090
Move exception inline
bolkedebruin Oct 4, 2023
c7774bd
Clean ups
bolkedebruin Oct 5, 2023
133fc4e
Make FileIO work with connection ids
bolkedebruin Oct 5, 2023
ea5584a
Add simple mounts
bolkedebruin Oct 5, 2023
10d7582
Add simple combinations
bolkedebruin Oct 5, 2023
6082452
Allow unmount to use str or Mount
bolkedebruin Oct 5, 2023
11a1f2d
Pre commit stuff - what a mess that creates :/
bolkedebruin Oct 6, 2023
3d57e55
PY38 fixes
bolkedebruin Oct 6, 2023
fbb151b
Address pre-commit
bolkedebruin Oct 6, 2023
ba88b12
Support contexts and PathLib concatenation
bolkedebruin Oct 6, 2023
390c80d
Add s3fs to devel
bolkedebruin Oct 17, 2023
e8331cc
Use deterministic endpoints and generate fsid if not available
bolkedebruin Oct 7, 2023
3d1595f
Fix table
bolkedebruin Oct 8, 2023
48f47b4
Use PathLike object
bolkedebruin Oct 17, 2023
b2f5e26
Fix mypy and test
bolkedebruin Oct 17, 2023
772b754
Simplify implementaton
bolkedebruin Oct 18, 2023
8a97977
Use ObjectStoragePath directly
bolkedebruin Oct 18, 2023
0a46254
Fix docstrings
bolkedebruin Oct 18, 2023
b6b92b2
Check if samestore (maybe just switch to ObjectStorePath copy)
bolkedebruin Oct 18, 2023
fdcd1ba
Use class name instead of type
bolkedebruin Oct 18, 2023
b62778f
Use shutil for copying between stores
bolkedebruin Oct 18, 2023
45c8183
Make sure to set alias only when not specified
bolkedebruin Oct 18, 2023
1c1c2a9
Use backing copy
bolkedebruin Oct 18, 2023
328eb00
Fix test
bolkedebruin Oct 18, 2023
95da859
Fix test
bolkedebruin Oct 18, 2023
6a1b525
Implement caching for filesystems
bolkedebruin Oct 19, 2023
41785d0
Move FileTransferOperator to provider package
bolkedebruin Oct 19, 2023
44a03ae
Pin dependencies
bolkedebruin Oct 19, 2023
bd8d091
Pin aiobotocore until new release of fsspec
bolkedebruin Oct 19, 2023
6654d59
Address version name
bolkedebruin Oct 19, 2023
7df70f3
Don't copy paste too much
bolkedebruin Oct 19, 2023
cb6f442
Use aws infrastructure for getting a session
bolkedebruin Oct 19, 2023
1e40842
Make sure endpoint_url is honored
bolkedebruin Oct 19, 2023
42e897a
Remove s3fs from main and keep in provider
bolkedebruin Oct 19, 2023
c41dcfd
Use service config
bolkedebruin Oct 19, 2023
42908de
remove s3fs when testing aws
bolkedebruin Oct 19, 2023
94ec2e3
Make sure prod can build
bolkedebruin Oct 19, 2023
6a5de18
Fix tests to not depend on s3fs
bolkedebruin Oct 19, 2023
9d7fcce
Readd s3fs to setup.py
bolkedebruin Oct 19, 2023
8b3b194
Fix issues with docs
bolkedebruin Oct 20, 2023
caf2253
fix link
bolkedebruin Oct 20, 2023
841dbdc
Add example dag
bolkedebruin Oct 20, 2023
96516af
Optimize copy
bolkedebruin Oct 20, 2023
f085180
Extra
bolkedebruin Oct 20, 2023
6dc8c1a
Regen images
bolkedebruin Oct 20, 2023
73a9f57
Fix docs
bolkedebruin Oct 21, 2023
74ae073
Update tests not to be dependent on s3fs
bolkedebruin Oct 21, 2023
b2fd6fc
Moved example test
bolkedebruin Oct 21, 2023
6dae720
Add stat_result as a way to unify info and traditional stat_result
bolkedebruin Oct 21, 2023
7b6a71e
Clean up
bolkedebruin Oct 21, 2023
1f125da
Add extra docs
bolkedebruin Oct 21, 2023
91e0e94
Fix docs
bolkedebruin Oct 21, 2023
acf81f5
Add words
bolkedebruin Oct 21, 2023
671924c
Improve copying
bolkedebruin Oct 22, 2023
b1a4cbb
Upgrade fsspec and relax aiobotocore requirements
bolkedebruin Oct 22, 2023
4dea223
Update docs/apache-airflow/core-concepts/objectstorage.rst
bolkedebruin Oct 21, 2023
002de89
Update docs/apache-airflow/core-concepts/objectstorage.rst
bolkedebruin Oct 21, 2023
159c908
Revert "Upgrade fsspec and relax aiobotocore requirements"
bolkedebruin Oct 22, 2023
958eb51
Relax aiobotocore
bolkedebruin Oct 22, 2023
ec870c6
Make copy work as expected
bolkedebruin Oct 23, 2023
a58000d
Revert "Relax aiobotocore"
bolkedebruin Oct 23, 2023
b217eea
Make rename work only within same store
bolkedebruin Oct 23, 2023
855485a
Fix tests
bolkedebruin Oct 23, 2023
d07d044
Fix test not te reuse alias
bolkedebruin Oct 23, 2023
ed2f928
Ensure templated fields for xcom
bolkedebruin Oct 23, 2023
cb8e4a4
Improve handling of existing directories
bolkedebruin Oct 23, 2023
980c3ee
Set aiobotocore to 2.7.0
bolkedebruin Oct 23, 2023
1f26001
Allow larger versions of aiobotocore
bolkedebruin Oct 23, 2023
8f8912b
Update airflow/providers/amazon/aws/fs/s3.py
bolkedebruin Oct 24, 2023
d7fc935
Add tests for s3fs
bolkedebruin Oct 24, 2023
1af0dc3
Add example dag
bolkedebruin Oct 24, 2023
836131d
Improve example
bolkedebruin Oct 24, 2023
2e1cba6
Improve example
bolkedebruin Oct 24, 2023
88b9216
Add tutorial and improve docs
bolkedebruin Oct 27, 2023
5099f45
Add extra
bolkedebruin Oct 27, 2023
Fix test not te reuse alias
bolkedebruin committed Oct 27, 2023
commit d07d044eec05e16e30bd1ff2fbb5946fda823a77
8 changes: 4 additions & 4 deletions tests/io/store/test_store.py
@@ -177,21 +177,21 @@ def test_move_remote(self):

     def test_copy_remote_remote(self):
         # foo = xxx added to prevent same fs token
-        attach("fakefs", fs=FakeRemoteFileSystem(auto_mkdir=True, foo="bar"))
-        attach("fakefs2", fs=FakeRemoteFileSystem(auto_mkdir=True, foo="baz"))
+        attach("ffs", fs=FakeRemoteFileSystem(auto_mkdir=True, foo="bar"))
+        attach("ffs2", fs=FakeRemoteFileSystem(auto_mkdir=True, foo="baz"))

         dir_src = f"/tmp/{str(uuid.uuid4())}"
         dir_dst = f"/tmp/{str(uuid.uuid4())}"
         key = "foo/bar/baz.txt"

         # note we are dealing with object storage characteristics
         # while working on a local filesystem, so it might feel not intuitive
-        _from = ObjectStoragePath(f"fakefs://{dir_src}")
+        _from = ObjectStoragePath(f"ffs://{dir_src}")
         _from_file = _from / key
         _from_file.touch()
         assert _from_file.exists()

-        _to = ObjectStoragePath(f"fakefs2://{dir_dst}")
+        _to = ObjectStoragePath(f"ffs2://{dir_dst}")
         _from.copy(_to)

         assert _to.exists()
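
The test comment "foo = xxx added to prevent same fs token" can be illustrated with a minimal sketch. It assumes, as fsspec does for its filesystem classes, that instances are cached under a token derived from the constructor arguments, so two filesystems built with identical arguments would be the same object. `TokenCachedFS` below is a hypothetical stand-in written only for this illustration; it is not part of Airflow or fsspec, and the hashing scheme is an assumption rather than the library's real one.

```python
import hashlib
import json


class TokenCachedFS:
    """Hypothetical filesystem-like class whose instances are cached by token."""

    _cache = {}

    def __new__(cls, **kwargs):
        # Build a deterministic token from the class name and constructor kwargs.
        payload = json.dumps([cls.__name__, kwargs], sort_keys=True)
        token = hashlib.sha256(payload.encode()).hexdigest()
        if token not in cls._cache:
            instance = super().__new__(cls)
            instance.token = token
            cls._cache[token] = instance
        # Identical arguments always hand back the cached instance.
        return cls._cache[token]

    def __init__(self, **kwargs):
        self.kwargs = kwargs


# Same arguments: both names refer to one cached instance.
same_a = TokenCachedFS(auto_mkdir=True, foo="bar")
same_b = TokenCachedFS(auto_mkdir=True, foo="bar")

# A differing dummy argument yields a different token, hence a distinct instance.
other = TokenCachedFS(auto_mkdir=True, foo="baz")

print(same_a is same_b)
print(same_a is other)
```

Under that assumption, passing a distinct dummy keyword such as `foo="bar"` versus `foo="baz"` is enough to make the two fake remote filesystems count as separate stores, which is what a remote-to-remote copy test needs.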