fileflow.task_runners package¶

fileflow.task_runners.task_runner module¶

class fileflow.task_runners.task_runner.TaskRunner(context)[source]¶

Bases: object

get_input_filename(data_dependency, dag_id=None)[source]¶

Generate the default input filename for a class.

Parameters:	data_dependency (str) – Key for the target data_dependency in self.data_dependencies that you want to construct a filename for. dag_id (str) – Defaults to the current DAG id
Returns:	File system path or S3 URL to the input file.
Return type:	str

get_output_filename()[source]¶

Generate the default output filename or S3 URL for this task instance.

Returns:	File system path to output filename
Return type:	str

get_upstream_stream(data_dependency_key, dag_id=None)[source]¶

Returns a stream to the file that was output by a seperate task in the same dag.

Parameters:	data_dependency_key (str) – The key (business logic name) for the upstream dependency. This will get the value from the self.data_dependencies dictionary to determine the file to read from. dag_id (str) – Defaults to the current DAG id. encoding (str) – The file encoding to use. Defaults to ‘utf-8’.
Returns:	stream to the file
Return type:	stream

read_upstream_file(data_dependency_key, dag_id=None, encoding='utf-8')[source]¶

Reads the file that was output by a seperate task in the same dag.

Parameters:	data_dependency_key (str) – The key (business logic name) for the upstream dependency. This will get the value from the self.data_dependencies dictionary to determine the file to read from. dag_id (str) – Defaults to the current DAG id. encoding (str) – The file encoding to use. Defaults to ‘utf-8’.
Returns:	Result of reading the file
Return type:	str

read_upstream_json(data_dependency_key, dag_id=None, encoding='utf-8')[source]¶

Reads a json file from upstream into a python object.

Parameters:	data_dependency_key (str) – The key for the upstream data dependency. This will get the value from the self.data_dependencies dict to determine the file to read. dag_id (str) – Defaults to the current DAG id. encoding (str) – The file encoding. Defaults to ‘utf-8’.
Returns:	A python object.

read_upstream_pandas_csv(data_dependency_key, dag_id=None, encoding='utf-8')[source]¶

Reads a csv file from upstream into a pandas DataFrame. Specifically reads a csv into memory as a pandas dataframe in a standard manner. Reads the data in from a file output by a previous task.

Parameters:	data_dependency_key (str) – The key (business logic name) for the upstream dependency. This will get the value from the self.data_dependencies dictionary to determine the file to read from. dag_id (str) – Defaults to the current DAG id. encoding (str) – The file encoding to use. Defaults to ‘utf-8’.
Returns:	The pandas dataframe.
Return type:	`pd.DataFrame`

run(*args, **kwargs)[source]¶

write_file(data, content_type='text/plain')[source]¶

Writes the data out to the correct file.

Parameters:	data (str) – The data to output. content_type (str) – The Content-Type to use. Currently only used by S3.

write_from_stream(stream, content_type='text/plain')[source]¶

write_json(data)[source]¶

Write a python object to a JSON output file.

Parameters:	data (object) – The python object to save.

write_pandas_csv(data)[source]¶

Specifically writes a csv from a pandas dataframe to the default output file in a standard manner.

Parameters:	data – the dataframe to write.

write_timestamp_file()[source]¶: Writes an output file with the current timestamp.