Set Schedule
Datalab uses Airflow to run scheduled tasks. Detailed documentation is available at https://airflow.apache.org/. This page walks through only a simple example of scheduling Jupyter notebooks in Datalab.
The DAG directory is set to /app/dags. You can create a new folder named dags.
Inside your dags folder, create a new file named after its purpose, such as hello_world.py. It is recommended to give the file the same name as the dag_id defined inside it.
from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.operators.bash_operator import BashOperator
from dsutil import NbExecuter
# Define default args
default_args = {
    'owner': 'user',
    'on_failure_callback': lambda context: True
}
# Define DAG setting
dag = DAG('hello_world', description='First ',
          schedule_interval='04 20 * * *',
          start_date=datetime(2017, 10, 19),
          default_args=default_args,
          catchup=False)
# Define DAG components
first_nb = PythonOperator(
    task_id='first_nb',
    python_callable=NbExecuter.execute_nb2,
    provide_context=True,
    op_kwargs={'path': '/app/user-ws/hello_world/notebook/first_nb.ipynb'},
    dag=dag
)
second_nb = PythonOperator(
    # task_id must be unique within the DAG
    task_id='second_nb',
    python_callable=NbExecuter.execute_nb2,
    provide_context=True,
    op_kwargs={'path': '/app/user-ws/hello_world/notebook/second_nb.ipynb'},
    dag=dag
)
# Define dependencies
first_nb >> second_nb
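The schedule_interval value used above is a standard five-field cron expression. As a rough illustration of how to read it, the following sketch splits the expression into named fields using only the standard library. The helper name split_cron is hypothetical and is not part of Airflow or dsutil.

```python
# Field names of a standard five-field cron expression, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression):
    """Map each field of a five-field cron expression to its name.

    Hypothetical helper for illustration only; Airflow parses the
    expression itself.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected 5 fields, got %d" % len(parts))
    return dict(zip(CRON_FIELDS, parts))


fields = split_cron("04 20 * * *")
for name in CRON_FIELDS:
    print(name, "=", fields[name])
```

With minute 04, hour 20, and wildcards in the remaining fields, the DAG is scheduled once a day at 20:04. Airflow evaluates this in its configured time zone, which is UTC by default.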
After you save the file, Airflow automatically detects and registers it. This takes a couple of minutes. You can then find the registered DAG on the Airflow webserver on port 9090.
On the Airflow web interface, toggle the DAG from Off to On. The DAG will start within a minute.
Besides using Airflow's debug portal, you can also run a notebook directly with the following code, which does not have to wait for Airflow to refresh:
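Because registration takes a few minutes, it can help to check the naming recommendation before saving. The sketch below is one possible way to do that, not a Datalab or Airflow feature: it parses a DAG file with the standard ast module, without importing Airflow, and checks whether a DAG(...) call uses a dag_id equal to the file name. The function names find_dag_ids and matches_filename are hypothetical.

```python
import ast
import os


def find_dag_ids(source):
    """Collect string literals passed as the dag_id of DAG(...) calls."""
    dag_ids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handles both DAG(...) and something.DAG(...)
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DAG":
            continue
        # dag_id is either the first positional argument or a keyword
        candidates = list(node.args[:1])
        candidates += [kw.value for kw in node.keywords if kw.arg == "dag_id"]
        for value in candidates:
            if isinstance(value, ast.Constant) and isinstance(value.value, str):
                dag_ids.append(value.value)
    return dag_ids


def matches_filename(path, source):
    """True if the file name (without .py) is one of the dag_ids in source."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem in find_dag_ids(source)


example = (
    "from airflow import DAG\n"
    "dag = DAG('hello_world', schedule_interval='04 20 * * *')\n"
)
print(matches_filename("/app/dags/hello_world.py", example))
```

The check only sees dag_id values written as string literals. A dag_id built from a variable would be missed.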
from dsutil import NbExecuter
NbExecuter.execute_nb('/app/user-ws/hello_world/notebook/first_nb.ipynb', '2018-10-02')
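To run the same notebook for several days in a row, the call above could be wrapped in a small loop. The sketch below assumes that the second argument of execute_nb is the run date as a YYYY-MM-DD string, which this page does not state explicitly. The helper run_for_dates and the stand-in fake_execute are hypothetical; fake_execute only records its arguments so that the sketch runs without dsutil.

```python
from datetime import date, timedelta


def run_for_dates(execute, path, start, days):
    """Call execute(path, 'YYYY-MM-DD') once per day, starting at `start`."""
    run_dates = [(start + timedelta(days=i)).isoformat() for i in range(days)]
    for run_date in run_dates:
        execute(path, run_date)
    return run_dates


# Stand-in for NbExecuter.execute_nb (assumption: same two arguments).
calls = []


def fake_execute(path, run_date):
    calls.append((path, run_date))


dates = run_for_dates(
    fake_execute,
    '/app/user-ws/hello_world/notebook/first_nb.ipynb',
    date(2018, 10, 2),
    3,
)
print(dates)
```

Inside Datalab, passing NbExecuter.execute_nb in place of fake_execute would run the notebook once for each date, provided the assumption about the second argument holds.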