A scraping process that downloads every summary vote tally (JPG or PDF file) published by the government of El Salvador. There are approximately 16 million votes, and the summary vote totals will be used by the independent project of the Guatemalan non-profit organization FundacionHCG.org and FiscalDigital.net.
Set every environment variable before running any script or Python file:
Create a new .env file inside the src directory, using example.env as a template and changing only the values. The .env file is ignored by Git and will never be committed to the repository.
The environment variables below are required:
- BUCKET_NAME=example-bucket-name
- AWS_ACCESS_KEY_ID=EXAMPLE1234DEMO
- AWS_SECRET_ACCESS_KEY=ExAmPle1d2Em3o4Acce56ss7Key
- AWS_DEFAULT_REGION=us-west-1
- BROWSER_PATH=src/scraping/browser/app/chrome-linux64/chrome
- BROWSER_DRIVER_PATH=src/scraping/browser/driver/chromedriver-linux64/chromedriver
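A quick way to catch a missing setting before running a script is to check the variables listed above. The following is a minimal sketch, not the project's own code: `missing_env_vars` is a hypothetical helper, and it assumes the values from .env have already been loaded into the process environment.

```python
import os

# Variable names copied from the required list above.
REQUIRED_VARS = (
    "BUCKET_NAME",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "BROWSER_PATH",
    "BROWSER_DRIVER_PATH",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper: `env` defaults to os.environ, but any mapping
    can be passed in, which makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A script could call `missing_env_vars()` at startup and exit with a clear message if the returned list is not empty.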
Read more in README.md.
Execute each script from the elecciones-salvador directory root.
Provides methods for listing, uploading, downloading, and deleting files in an S3 bucket.
# from elecciones-salvador root directory
$prompt> python src/aws/awss3.py --help
List all objects in the bucket.
# from elecciones-salvador root directory
$prompt> python src/aws/awss3.py --list
Upload a file to a bucket.
# from elecciones-salvador root directory
$prompt> python src/aws/awss3.py --upload src/data/0_raw/bike.jpg
Download a file from a bucket.
# from elecciones-salvador root directory
$prompt> python src/aws/awss3.py --download bike.jpg
Delete a file from a bucket.
# from elecciones-salvador root directory
$prompt> python src/aws/awss3.py --delete bike.jpg
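For readers who want to see what the four operations above could look like in code, here is a minimal sketch using boto3. It is not the contents of src/aws/awss3.py: the function names are hypothetical, and using the file's base name as the object key is an assumption inferred from the commands above (upload `src/data/0_raw/bike.jpg`, then download `bike.jpg`).

```python
import os


def make_client():
    """Create an S3 client; boto3 reads the AWS_* variables from the environment."""
    import boto3  # imported lazily so the helpers below accept any compatible client

    return boto3.client("s3")


def list_keys(client, bucket):
    """Return every object key in the bucket, following pagination."""
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # An empty bucket yields a page without a "Contents" entry.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


def upload(client, bucket, path):
    """Upload a local file; the object key is the file's base name (assumed)."""
    key = os.path.basename(path)
    client.upload_file(path, bucket, key)
    return key


def download(client, bucket, key, dest_dir="."):
    """Download an object into dest_dir and return the local path."""
    dest = os.path.join(dest_dir, key)
    client.download_file(bucket, key, dest)
    return dest


def delete(client, bucket, key):
    """Delete a single object from the bucket."""
    client.delete_object(Bucket=bucket, Key=key)
```

With the environment variables set, `list_keys(make_client(), os.environ["BUCKET_NAME"])` would return the keys currently stored in the bucket.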
A daemon (demon.py) that performs scraping and file upload tasks.
# from elecciones-salvador root directory
$prompt> python src/scraping/demon.py --help
TODO: Scrapes data from a website and saves it in the src/data/0_raw directory.
# from elecciones-salvador root directory
$prompt> python src/scraping/demon.py --scraper
Uploads files from src/data/0_raw directory to an S3 bucket.
# from elecciones-salvador root directory
$prompt> python src/scraping/demon.py --upload
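The upload step above amounts to walking the raw data directory and handing each file to an uploader. The sketch below illustrates that idea only; it is not the code in demon.py. `upload_raw_files` and the `upload_one` callback are hypothetical names, and scanning only the top level of the directory (no recursion) is an assumption.

```python
from pathlib import Path


def upload_raw_files(upload_one, raw_dir="src/data/0_raw"):
    """Call upload_one(path) for each regular file directly inside raw_dir.

    Returns the list of paths that were handed to upload_one, in sorted
    order. Subdirectories are skipped (assumption: the raw directory is flat).
    """
    uploaded = []
    for path in sorted(Path(raw_dir).iterdir()):
        if path.is_file():
            upload_one(str(path))
            uploaded.append(str(path))
    return uploaded
```

Passing the uploader in as a callback keeps the directory walk independent of S3, so it can be exercised locally with something as simple as `upload_raw_files(print)`.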