API reference
sparklanes
class sparklanes.Branch(name='UnnamedBranch', run_parallel=False)
Bases: sparklanes.Lane, object
Branches split a lane into sub-lanes. This is useful when, for example, part of the data processing pipeline should be executed in parallel while the other parts run sequentially.
class sparklanes.Lane(name='UnnamedLane', run_parallel=False)
Bases: object
Used to build and run data processing lanes (i.e. pipelines). Public methods are chainable.
add(cls_or_branch, *args, **kwargs)
Adds a task or branch to the lane.
Parameters:
- cls_or_branch (class or Branch) – The task class or branch to add
- *args – Variable-length argument list passed to cls_or_branch during instantiation
- **kwargs – Arbitrary keyword arguments passed to cls_or_branch during instantiation
Returns: self, to allow method chaining
Return type: Lane
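The chaining behaviour of add can be illustrated without Spark. The following is a minimal sketch, not the library's implementation: MiniLane, MiniBranch, Extract, Transform and Load are hypothetical stand-ins, and storing each class together with its arguments for later instantiation is an assumption made for illustration.

```python
class MiniLane:
    """Hypothetical stand-in for sparklanes.Lane (not the library's code)."""

    def __init__(self, name="UnnamedLane", run_parallel=False):
        self.name = name
        self.run_parallel = run_parallel
        # Each entry is either a nested MiniLane or a (cls, args, kwargs) tuple.
        self.tasks = []

    def add(self, cls_or_branch, *args, **kwargs):
        if isinstance(cls_or_branch, MiniLane):
            # A branch is itself a lane, so it is stored as-is.
            self.tasks.append(cls_or_branch)
        else:
            # Assumption: keep the class and its arguments for later instantiation.
            self.tasks.append((cls_or_branch, args, kwargs))
        return self  # returning self is what makes the calls chainable


class MiniBranch(MiniLane):
    """Hypothetical stand-in for sparklanes.Branch."""

    def __init__(self, name="UnnamedBranch", run_parallel=False):
        super().__init__(name=name, run_parallel=run_parallel)


# Placeholder task classes, used only as markers in this sketch.
class Extract:
    pass


class Transform:
    pass


class Load:
    pass


parallel_part = (
    MiniBranch(name="ParallelPart", run_parallel=True)
    .add(Transform, factor=2)
    .add(Transform, factor=3)
)

lane = (
    MiniLane(name="ExampleLane")
    .add(Extract, "input.csv")
    .add(parallel_part)
    .add(Load, table="results")
)
```

Because every call returns the lane itself, a sequential lane with one parallel branch can be declared as a single expression.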
class sparklanes.SparkContextAndSessionContainer
Bases: object
Container class holding the SparkContext and SparkSession instances, so that any changes will be propagated across the application.
sc = None
classmethod set_sc(master=None, appName=None, sparkHome=None, pyFiles=None, environment=None, batchSize=0, serializer=PickleSerializer(), conf=None, gateway=None, jsc=None, profiler_cls=<class 'pyspark.profiler.BasicProfiler'>)
Creates and initializes a new SparkContext; the old one is stopped. The argument signature is copied from pyspark.SparkContext.
classmethod set_spark(master=None, appName=None, conf=None, hive_support=False)
Creates and initializes a new SparkSession. The argument signature is copied from pyspark.sql.SparkSession.
spark = None
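Why class attributes make changes visible across the application can be sketched in plain Python. This is an illustration only, under stated assumptions: SharedContainer and FakeContext are hypothetical stand-ins, and unlike the real set_sc (which takes the SparkContext constructor arguments) the setter here accepts an already-built object so the example runs without Spark.

```python
class FakeContext:
    """Hypothetical stand-in for a SparkContext; only tracks stop() calls."""

    def __init__(self, app_name):
        self.app_name = app_name
        self.stopped = False

    def stop(self):
        self.stopped = True


class SharedContainer:
    """Hypothetical container that keeps the shared handle as a class attribute."""

    sc = None

    @classmethod
    def set_sc(cls, new_sc):
        # Stop the previous context, if any, before replacing it.
        if cls.sc is not None:
            cls.sc.stop()
        cls.sc = new_sc


def current_app_name():
    # The attribute is looked up on the class at call time,
    # so this always sees the most recently set context.
    return SharedContainer.sc.app_name


first = FakeContext("first")
SharedContainer.set_sc(first)

second = FakeContext("second")
SharedContainer.set_sc(second)
```

Code that reads the handle through the class, rather than keeping its own copy, picks up the replacement automatically.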
sparklanes.Task(entry)
Decorator that must be applied to every class that acts as a task in a Lane. A decorated class becomes a child of LaneTask.
Parameters: entry (str) – The name of the task's "main" method, i.e. the method that is executed when the task is run
Returns: wrapper, the actual decorator function
Return type: function
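The general pattern of a decorator that takes an entry-method name and turns the decorated class into a child of a base class can be sketched as follows. This is a hedged illustration, not the code of sparklanes.Task: BaseTask, task, Extract and the __call__ dispatch are hypothetical names and choices made for this example.

```python
class BaseTask:
    """Hypothetical stand-in for LaneTask."""

    _entry = None  # name of the "main" method, filled in by the decorator

    def __call__(self):
        # Run the task by dispatching to its entry method.
        return getattr(self, self._entry)()


def task(entry):
    """Decorator factory: returns the actual decorator (wrapper)."""

    def wrapper(cls):
        if not callable(getattr(cls, entry, None)):
            raise TypeError(
                "%s has no callable method named %r" % (cls.__name__, entry)
            )
        # Build a new class that inherits from both the original class
        # and BaseTask, so the result is a child of BaseTask.
        return type(cls.__name__, (cls, BaseTask), {"_entry": entry})

    return wrapper


@task("run")
class Extract:
    def __init__(self, source):
        self.source = source

    def run(self):
        return "extracted from " + self.source


result = Extract("a.csv")()
```

In this sketch, validating the entry name at decoration time means a misspelled method name fails when the class is defined rather than when the lane runs.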
sparklanes.build_lane_from_yaml(path)
Builds a sparklanes.Lane object from a YAML definition file.
Parameters: path (str) – Path to the YAML definition file
Returns: Lane, built according to the definition in the YAML file
Return type: Lane
sparklanes.conn