Sub-commands


backfill

Run subsections of a DAG for a specified date range. If the reset_dagruns option is used, backfill first prompts the user to confirm whether Airflow should clear all previous dag_runs and task_instances within the backfill date range. If the rerun_failed_tasks option is used, backfill automatically re-runs the previously failed task instances within the backfill date range.

airflow backfill [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-m] [-l]
                 [-x] [-i] [-I] [-sd SUBDIR] [--pool POOL]
                 [--delay_on_limit DELAY_ON_LIMIT] [-dr] [-v] [-c CONF]
                 [--reset_dagruns] [--rerun_failed_tasks] [-B]
                 dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-t, --task_regex

The regex to filter specific task_ids to backfill (optional)

-s, --start_date

Override start_date YYYY-MM-DD

-e, --end_date

Override end_date YYYY-MM-DD

-m, --mark_success

Mark jobs as succeeded without running them

Default: False

-l, --local

Run the task using the LocalExecutor

Default: False

-x, --donot_pickle

Do not attempt to pickle the DAG object to send over to the workers, just tell the workers to run their version of the code.

Default: False

-i, --ignore_dependencies

Skip upstream tasks, run only the tasks matching the regexp. Only works in conjunction with task_regex

Default: False

-I, --ignore_first_depends_on_past

Ignores depends_on_past dependencies for the first set of tasks only (subsequent executions in the backfill DO respect depends_on_past).

Default: False

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

--pool

Resource pool to use

--delay_on_limit

Amount of time in seconds to wait when the limit on maximum active dag runs (max_active_runs) has been reached before trying to execute a dag run again.

Default: 1.0

-dr, --dry_run

Perform a dry run

Default: False

-v, --verbose

Make logging output more verbose

Default: False

-c, --conf

JSON string that gets pickled into the DagRun’s conf attribute

--reset_dagruns

If set, the backfill will delete existing backfill-related DAG runs and start anew with fresh, running DAG runs.

Default: False

--rerun_failed_tasks

If set, the backfill will automatically re-run all the failed tasks for the backfill date range instead of throwing exceptions.

Default: False

-B, --run_backwards

If set, the backfill will run tasks from the most recent day first. If any task has depends_on_past set, this option will throw an exception.

Default: False
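
The options above combine into a single command line. A minimal Python sketch of how that might be assembled follows; the DAG id `example_dag` and the helper name `build_backfill_cmd` are hypothetical and not part of Airflow, and only flags listed above are used.

```python
from datetime import date


def build_backfill_cmd(dag_id, start, end, task_regex=None, rerun_failed=False):
    """Assemble an `airflow backfill` argument list (hypothetical helper)."""
    cmd = ["airflow", "backfill",
           "-s", start.isoformat(),   # rendered as YYYY-MM-DD
           "-e", end.isoformat()]
    if task_regex is not None:
        cmd += ["-t", task_regex]
    if rerun_failed:
        cmd.append("--rerun_failed_tasks")
    cmd.append(dag_id)                # the positional dag_id goes last
    return cmd


cmd = build_backfill_cmd("example_dag", date(2019, 1, 1), date(2019, 1, 7),
                         rerun_failed=True)
print(" ".join(cmd))
```

On a host where the `airflow` executable is installed, the resulting list could be handed to `subprocess.run(cmd, check=True)`.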

list_dag_runs

List dag runs given a DAG id. If the state option is given, only dag runs with the given state are listed. If the no_backfill option is given, all backfill dag runs for the given dag id are filtered out.

airflow list_dag_runs [-h] [--no_backfill] [--state STATE] dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

--no_backfill

Filter out all the backfill dagruns for the given dag id

Default: False

--state

Only list the dag runs corresponding to the state

list_tasks

List the tasks within a DAG

airflow list_tasks [-h] [-t] [-sd SUBDIR] dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-t, --tree

Tree view

Default: False

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”
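
The -sd default used throughout this page is derived from AIRFLOW_HOME. As a rough sketch, assuming AIRFLOW_HOME is read from the environment and falls back to ~/airflow when unset, the default could be computed as below. The helper name `default_subdir` is hypothetical and is not an Airflow function.

```python
import os


def default_subdir(environ=None):
    """Return the '[AIRFLOW_HOME]/dags' path described above (illustrative helper)."""
    environ = os.environ if environ is None else environ
    # Assumption: AIRFLOW_HOME falls back to ~/airflow when it is not set.
    home = environ.get("AIRFLOW_HOME") or os.path.expanduser("~/airflow")
    return os.path.join(home, "dags")


# Example with an explicit, made-up AIRFLOW_HOME value.
print(default_subdir({"AIRFLOW_HOME": "/opt/airflow"}))
```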

clear

Clear a set of task instances, as if they never ran

airflow clear [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-sd SUBDIR]
              [-u] [-d] [-c] [-f] [-r] [-x] [-xp] [-dx]
              dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-t, --task_regex

The regex to filter specific task_ids to clear (optional)

-s, --start_date

Override start_date YYYY-MM-DD

-e, --end_date

Override end_date YYYY-MM-DD

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

-u, --upstream

Include upstream tasks

Default: False

-d, --downstream

Include downstream tasks

Default: False

-c, --no_confirm

Do not request confirmation

Default: False

-f, --only_failed

Only failed jobs

Default: False

-r, --only_running

Only running jobs

Default: False

-x, --exclude_subdags

Exclude subdags

Default: False

-xp, --exclude_parentdag

Exclude ParentDAGS if the task cleared is a part of a SubDAG

Default: False

-dx, --dag_regex

Search dag_id as regex instead of exact string

Default: False
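
The boolean switches of clear take no value and can be combined freely. The sketch below shows one way to turn keyword arguments into those switches; `build_clear_cmd` and `example_dag` are hypothetical names used only for illustration.

```python
def build_clear_cmd(dag_id, start_date=None, end_date=None,
                    only_failed=False, downstream=False, no_confirm=False):
    """Assemble an `airflow clear` argument list (hypothetical helper)."""
    cmd = ["airflow", "clear"]
    if start_date:
        cmd += ["-s", start_date]     # expected as YYYY-MM-DD
    if end_date:
        cmd += ["-e", end_date]       # expected as YYYY-MM-DD
    # Each switch defaults to False and is emitted only when enabled.
    for flag, enabled in (("-f", only_failed), ("-d", downstream), ("-c", no_confirm)):
        if enabled:
            cmd.append(flag)
    cmd.append(dag_id)
    return cmd


cmd = build_clear_cmd("example_dag", start_date="2019-01-01",
                      only_failed=True, no_confirm=True)
print(" ".join(cmd))
```

Passing `no_confirm=True` suppresses the confirmation prompt, which matters when the command runs unattended.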

pause

Pause a DAG

airflow pause [-h] [-sd SUBDIR] dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

unpause

Resume a paused DAG

airflow unpause [-h] [-sd SUBDIR] dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

trigger_dag

Trigger a DAG run

airflow trigger_dag [-h] [-sd SUBDIR] [-r RUN_ID] [-c CONF] [-e EXEC_DATE]
                    dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

-r, --run_id

Helps to identify this run

-c, --conf

JSON string that gets pickled into the DagRun’s conf attribute

-e, --exec_date

The execution date of the DAG
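
Because -c takes a JSON string, quoting it correctly for the shell is the usual stumbling block. The following sketch serializes a Python dict and quotes each argument; the helper name `trigger_dag_command`, the DAG id, the run id and the conf keys are all hypothetical.

```python
import json
import shlex


def trigger_dag_command(dag_id, conf=None, run_id=None):
    """Return a shell-safe `airflow trigger_dag` command string (hypothetical helper)."""
    args = ["airflow", "trigger_dag"]
    if run_id is not None:
        args += ["-r", run_id]
    if conf is not None:
        # -c expects a JSON string; sort_keys keeps the output stable.
        args += ["-c", json.dumps(conf, sort_keys=True)]
    args.append(dag_id)
    # shlex.quote protects the JSON braces and double quotes from the shell.
    return " ".join(shlex.quote(a) for a in args)


command = trigger_dag_command("example_dag", conf={"source": "manual"},
                              run_id="manual_1")
print(command)
```

Splitting the printed string with `shlex.split` gives back the original argument list, which is a quick way to check the quoting.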

delete_dag

Delete all DB records related to the specified DAG

airflow delete_dag [-h] [-y] dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-y, --yes

Do not prompt to confirm deletion. Use with care!

Default: False

pool

CRUD operations on pools

airflow pool [-h] [-s NAME SLOT_COUNT POOL_DESCRIPTION] [-g NAME] [-x NAME]
             [-i FILEPATH] [-e FILEPATH]

Named Arguments

-s, --set

Set pool slot count and description, respectively

-g, --get

Get pool info

-x, --delete

Delete a pool

-i, --import

Import pool from JSON file

-e, --export

Export pool to JSON file

variables

CRUD operations on variables

airflow variables [-h] [-s KEY VAL] [-g KEY] [-j] [-d VAL] [-i FILEPATH]
                  [-e FILEPATH] [-x KEY]

Named Arguments

-s, --set

Set a variable

-g, --get

Get value of a variable

-j, --json

Deserialize JSON variable

Default: False

-d, --default

Default value returned if variable does not exist

-i, --import

Import variables from JSON file

-e, --export

Export variables to JSON file

-x, --delete

Delete a variable
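
As an illustration of the import option, the sketch below writes a JSON file and builds the matching command lines. It assumes the import file is a flat JSON object mapping variable keys to values; the variable names, the values and the temporary file location are made up for the example.

```python
import json
import os
import tempfile

# Hypothetical variable names and values, for illustration only.
variables = {"environment": "staging", "retry_limit": 3}

# Assumption: the import file is a flat JSON object of key/value pairs.
path = os.path.join(tempfile.mkdtemp(), "variables.json")
with open(path, "w") as fh:
    json.dump(variables, fh, indent=2)

# Import the file, then read one key back with a fallback value (-d).
import_cmd = ["airflow", "variables", "-i", path]
get_cmd = ["airflow", "variables", "-g", "environment", "-d", "unknown"]
print(" ".join(import_cmd))
print(" ".join(get_cmd))
```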

kerberos

Start a Kerberos ticket renewer

airflow kerberos [-h] [-kt [KEYTAB]] [--pid [PID]] [-D] [--stdout STDOUT]
                 [--stderr STDERR] [-l LOG_FILE]
                 [principal]

Positional Arguments

principal

Kerberos principal

Named Arguments

-kt, --keytab

keytab

Default: “airflow.keytab”

--pid

PID file location

-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout

Redirect stdout to this file

--stderr

Redirect stderr to this file

-l, --log-file

Location of the log file

render

Render a task instance’s template(s)

airflow render [-h] [-sd SUBDIR] dag_id task_id execution_date

Positional Arguments

dag_id

The id of the dag

task_id

The id of the task

execution_date

The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

run

Run a single task instance

airflow run [-h] [-sd SUBDIR] [-m] [-f] [--pool POOL] [--cfg_path CFG_PATH]
            [-l] [-A] [-i] [-I] [--ship_dag] [-p PICKLE] [-int]
            dag_id task_id execution_date

Positional Arguments

dag_id

The id of the dag

task_id

The id of the task

execution_date

The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

-m, --mark_success

Mark jobs as succeeded without running them

Default: False

-f, --force

Ignore previous task instance state and rerun regardless of whether the task already succeeded or failed

Default: False

--pool

Resource pool to use

--cfg_path

Path to config file to use instead of airflow.cfg

-l, --local

Run the task using the LocalExecutor

Default: False

-A, --ignore_all_dependencies

Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps

Default: False

-i, --ignore_dependencies

Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies

Default: False

-I, --ignore_depends_on_past

Ignore depends_on_past dependencies (but respect upstream dependencies)

Default: False

--ship_dag

Pickles (serializes) the DAG and ships it to the worker

Default: False

-p, --pickle

Serialized pickle object of the entire dag (used internally)

-int, --interactive

Do not capture standard output and error streams (useful for interactive debugging)

Default: False

initdb

Initialize the metadata database

airflow initdb [-h]

list_dags

List all the DAGs

airflow list_dags [-h] [-sd SUBDIR] [-r]

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

-r, --report

Show DagBag loading report

Default: False

dag_state

Get the status of a dag run

airflow dag_state [-h] [-sd SUBDIR] dag_id execution_date

Positional Arguments

dag_id

The id of the dag

execution_date

The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

task_failed_deps

Returns the unmet dependencies for a task instance from the perspective of the scheduler. In other words, it shows why a task instance doesn’t get scheduled, queued by the scheduler, and then run by an executor.

airflow task_failed_deps [-h] [-sd SUBDIR] dag_id task_id execution_date

Positional Arguments

dag_id

The id of the dag

task_id

The id of the task

execution_date

The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

task_state

Get the status of a task instance

airflow task_state [-h] [-sd SUBDIR] dag_id task_id execution_date

Positional Arguments

dag_id

The id of the dag

task_id

The id of the task

execution_date

The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

serve_logs

Serve logs generated by workers

airflow serve_logs [-h]

test

Test a task instance. This will run a task without checking for dependencies or recording its state in the database.

airflow test [-h] [-sd SUBDIR] [-dr] [-tp TASK_PARAMS] [-pm]
             dag_id task_id execution_date

Positional Arguments

dag_id

The id of the dag

task_id

The id of the task

execution_date

The execution date of the DAG

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

-dr, --dry_run

Perform a dry run

Default: False

-tp, --task_params

Sends a JSON params dict to the task

-pm, --post_mortem

Open debugger on uncaught exception

Default: False

webserver

Start an Airflow webserver instance

airflow webserver [-h] [-p PORT] [-w WORKERS]
                  [-k {sync,eventlet,gevent,tornado}] [-t WORKER_TIMEOUT]
                  [-hn HOSTNAME] [--pid [PID]] [-D] [--stdout STDOUT]
                  [--stderr STDERR] [-A ACCESS_LOGFILE] [-E ERROR_LOGFILE]
                  [-l LOG_FILE] [--ssl_cert SSL_CERT] [--ssl_key SSL_KEY] [-d]

Named Arguments

-p, --port

The port on which to run the server

Default: 8080

-w, --workers

Number of workers to run the webserver on

Default: 1

-k, --workerclass

Possible choices: sync, eventlet, gevent, tornado

The worker class to use for Gunicorn

Default: “sync”

-t, --worker_timeout

The timeout for waiting on webserver workers

Default: 120

-hn, --hostname

Set the hostname on which to run the web server

Default: “0.0.0.0”

--pid

PID file location

-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout

Redirect stdout to this file

--stderr

Redirect stderr to this file

-A, --access_logfile

The logfile to store the webserver access log. Use ‘-’ to print to stderr.

Default: “-”

-E, --error_logfile

The logfile to store the webserver error log. Use ‘-’ to print to stderr.

Default: “-”

-l, --log-file

Location of the log file

--ssl_cert

Path to the SSL certificate for the webserver

--ssl_key

Path to the key to use with the SSL certificate

-d, --debug

Use the server that ships with Flask in debug mode

Default: False
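
The -k option accepts only the four worker classes listed above, so a wrapper script can reject other values before starting anything. A small sketch, with the hypothetical helper name `build_webserver_cmd` and defaults copied from the option list:

```python
# Choices copied from the -k/--workerclass description above.
WORKER_CLASSES = ("sync", "eventlet", "gevent", "tornado")


def build_webserver_cmd(port=8080, workers=1, worker_class="sync", daemon=False):
    """Assemble an `airflow webserver` argument list (hypothetical helper)."""
    if worker_class not in WORKER_CLASSES:
        raise ValueError(f"unsupported worker class: {worker_class!r}")
    cmd = ["airflow", "webserver",
           "-p", str(port),
           "-w", str(workers),
           "-k", worker_class]
    if daemon:
        cmd.append("-D")              # daemonize instead of running in the foreground
    return cmd


cmd = build_webserver_cmd(port=8081, workers=4, worker_class="gevent", daemon=True)
print(" ".join(cmd))
```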

resetdb

Burn down and rebuild the metadata database

airflow resetdb [-h] [-y]

Named Arguments

-y, --yes

Do not prompt to confirm reset. Use with care!

Default: False

upgradedb

Upgrade the metadata database to latest version

airflow upgradedb [-h]

scheduler

Start a scheduler instance

airflow scheduler [-h] [-d DAG_ID] [-sd SUBDIR] [-r RUN_DURATION]
                  [-n NUM_RUNS] [-p] [--pid [PID]] [-D] [--stdout STDOUT]
                  [--stderr STDERR] [-l LOG_FILE]

Named Arguments

-d, --dag_id

The id of the dag to run

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

-r, --run-duration

Set number of seconds to execute before exiting

-n, --num_runs

Set the number of runs to execute before exiting

Default: -1

-p, --do_pickle

Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.

Default: False

--pid

PID file location

-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout

Redirect stdout to this file

--stderr

Redirect stderr to this file

-l, --log-file

Location of the log file

worker

Start a Celery worker node

airflow worker [-h] [-p] [-q QUEUES] [-c CONCURRENCY] [-cn CELERY_HOSTNAME]
               [--pid [PID]] [-D] [--stdout STDOUT] [--stderr STDERR]
               [-l LOG_FILE] [-a AUTOSCALE]

Named Arguments

-p, --do_pickle

Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.

Default: False

-q, --queues

Comma delimited list of queues to serve

Default: “default”

-c, --concurrency

The number of worker processes

Default: 4

-cn, --celery_hostname

Set the hostname of celery worker if you have multiple workers on a single machine.

--pid

PID file location

-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout

Redirect stdout to this file

--stderr

Redirect stderr to this file

-l, --log-file

Location of the log file

-a, --autoscale

Minimum and maximum number of workers to autoscale

flower

Start a Celery Flower instance

airflow flower [-h] [-hn HOSTNAME] [-p PORT] [-fc FLOWER_CONF] [-u URL_PREFIX]
               [-ba BASIC_AUTH] [-a BROKER_API] [--pid [PID]] [-D]
               [--stdout STDOUT] [--stderr STDERR] [-l LOG_FILE]

Named Arguments

-hn, --hostname

Set the hostname on which to run the server

Default: “0.0.0.0”

-p, --port

The port on which to run the server

Default: 5555

-fc, --flower_conf

Configuration file for flower

-u, --url_prefix

URL prefix for Flower

-ba, --basic_auth

Secure Flower with basic authentication. Accepts user:password pairs separated by a comma. Example: flower_basic_auth = user1:password1,user2:password2

-a, --broker_api

Broker API

--pid

PID file location

-D, --daemon

Daemonize instead of running in the foreground

Default: False

--stdout

Redirect stdout to this file

--stderr

Redirect stderr to this file

-l, --log-file

Location of the log file
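
The value for -ba is a comma-separated list of user:password pairs. The sketch below builds it from Python tuples. Rejecting names or passwords that contain the delimiters is an assumption made here to keep the string unambiguous, not documented Flower behaviour, and the helper name `basic_auth_value` is hypothetical.

```python
def basic_auth_value(credentials):
    """Join (user, password) pairs into the user:password,... form (hypothetical helper)."""
    pairs = []
    for user, password in credentials:
        # Assumption: delimiters inside a field would make the value ambiguous.
        if ":" in user or "," in user or "," in password:
            raise ValueError(f"delimiter found in credentials for {user!r}")
        pairs.append(f"{user}:{password}")
    return ",".join(pairs)


# Placeholder credentials taken from the example in the option description.
value = basic_auth_value([("user1", "password1"), ("user2", "password2")])
cmd = ["airflow", "flower", "-ba", value]
print(" ".join(cmd))
```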

version

Show the version

airflow version [-h]

connections

List/Add/Delete connections

airflow connections [-h] [-l] [-a] [-d] [--conn_id CONN_ID]
                    [--conn_uri CONN_URI] [--conn_extra CONN_EXTRA]
                    [--conn_type CONN_TYPE] [--conn_host CONN_HOST]
                    [--conn_login CONN_LOGIN] [--conn_password CONN_PASSWORD]
                    [--conn_schema CONN_SCHEMA] [--conn_port CONN_PORT]

Named Arguments

-l, --list

List all connections

Default: False

-a, --add

Add a connection

Default: False

-d, --delete

Delete a connection

Default: False

--conn_id

Connection id, required to add/delete a connection

--conn_uri

Connection URI, required to add a connection without conn_type

--conn_extra

Connection Extra field, optional when adding a connection

--conn_type

Connection type, required to add a connection without conn_uri

--conn_host

Connection host, optional when adding a connection

--conn_login

Connection login, optional when adding a connection

--conn_password

Connection password, optional when adding a connection

--conn_schema

Connection schema, optional when adding a connection

--conn_port

Connection port, optional when adding a connection
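
The descriptions above imply two ways to add a connection: with --conn_uri, or with --conn_type plus optional host, port and similar fields. The sketch below encodes that rule. Treating the two forms as mutually exclusive is an assumption drawn from the option descriptions, and the helper name, connection id and URI are hypothetical.

```python
def build_add_connection_cmd(conn_id, conn_uri=None, conn_type=None,
                             conn_host=None, conn_port=None):
    """Assemble an `airflow connections -a` argument list (hypothetical helper)."""
    if not conn_id:
        raise ValueError("conn_id is required to add a connection")
    if (conn_uri is None) == (conn_type is None):
        # Assumption: exactly one of the two forms is supplied.
        raise ValueError("provide either conn_uri or conn_type, not both or neither")
    cmd = ["airflow", "connections", "-a", "--conn_id", conn_id]
    if conn_uri is not None:
        cmd += ["--conn_uri", conn_uri]
    else:
        cmd += ["--conn_type", conn_type]
        if conn_host is not None:
            cmd += ["--conn_host", conn_host]
        if conn_port is not None:
            cmd += ["--conn_port", str(conn_port)]
    return cmd


by_uri = build_add_connection_cmd(
    "example_db", conn_uri="postgres://user:secret@db.example.com:5432/analytics")
by_type = build_add_connection_cmd(
    "example_db", conn_type="postgres", conn_host="db.example.com", conn_port=5432)
print(" ".join(by_type))
```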

create_user

Create an account for the Web UI (FAB-based)

airflow create_user [-h] [-r ROLE] [-u USERNAME] [-e EMAIL] [-f FIRSTNAME]
                    [-l LASTNAME] [-p PASSWORD] [--use_random_password]

Named Arguments

-r, --role

Role of the user. Existing roles include Admin, User, Op, Viewer, and Public

-u, --username

Username of the user

-e, --email

Email of the user

-f, --firstname

First name of the user

-l, --lastname

Last name of the user

-p, --password

Password of the user

--use_random_password

Do not prompt for password. Use random string instead

Default: False
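
A provisioning script might check the role and choose between -p and --use_random_password before calling create_user. The following is a sketch only: the check against the five roles named above is a simplification, since the description says existing roles "include" them and does not rule out others, and the helper name and user details are hypothetical.

```python
# Roles named in the -r/--role description above.
KNOWN_ROLES = ("Admin", "User", "Op", "Viewer", "Public")


def build_create_user_cmd(username, email, firstname, lastname, role, password=None):
    """Assemble an `airflow create_user` argument list (hypothetical helper)."""
    if role not in KNOWN_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    cmd = ["airflow", "create_user",
           "-r", role, "-u", username, "-e", email,
           "-f", firstname, "-l", lastname]
    if password is None:
        cmd.append("--use_random_password")   # avoids the interactive prompt
    else:
        cmd += ["-p", password]
    return cmd


cmd = build_create_user_cmd("jdoe", "jdoe@example.com", "Jane", "Doe", "Viewer")
print(" ".join(cmd))
```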

delete_user

Delete an account for the Web UI

airflow delete_user [-h] [-u USERNAME]

Named Arguments

-u, --username

Username of the user

list_users

List accounts for the Web UI

airflow list_users [-h]

sync_perm

Update permissions for existing roles.

airflow sync_perm [-h]

next_execution

Get the next execution datetime of a DAG.

airflow next_execution [-h] [-sd SUBDIR] dag_id

Positional Arguments

dag_id

The id of the dag

Named Arguments

-sd, --subdir

File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’.

Default: “[AIRFLOW_HOME]/dags”

rotate_fernet_key

Rotate all encrypted connection credentials and variables; see https://airflow.readthedocs.io/en/stable/howto/secure-connections.html#rotating-encryption-keys.

airflow rotate_fernet_key [-h]