pylivy
Livy is an open source REST interface for interacting with Spark. pylivy is a Python client for Livy, enabling easy remote code execution on a Spark cluster.
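To make "REST interface" more concrete, the sketch below builds (but does not send) the kind of HTTP requests a client would issue to start an interactive session and submit a statement. The endpoint paths and JSON fields follow Livy's REST API; the helper function names are hypothetical, and this is not necessarily how pylivy structures its requests internally.

```python
import json

# Same example URL as used elsewhere in this document.
LIVY_URL = "http://spark.example.com:8998"


def create_session_request(base_url, kind="pyspark"):
    """Describe the request that asks Livy to start an interactive session."""
    return {
        "method": "POST",
        "url": f"{base_url}/sessions",
        "body": json.dumps({"kind": kind}),
    }


def run_statement_request(base_url, session_id, code):
    """Describe the request that submits code to an existing session."""
    return {
        "method": "POST",
        "url": f"{base_url}/sessions/{session_id}/statements",
        "body": json.dumps({"code": code}),
    }


# Print the requests instead of sending them, so no server is needed.
# The session id 0 is a placeholder; Livy assigns the real id on creation.
requests_to_send = [
    create_session_request(LIVY_URL),
    run_statement_request(LIVY_URL, 0, "filtered = df.filter(df.name == 'Bob')"),
]
for request in requests_to_send:
    print(request["method"], request["url"], request["body"])
```

A client such as pylivy saves you from assembling these requests, polling for results, and cleaning up the session by hand.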
Usage
The LivySession class is the main interface provided by pylivy:
from livy import LivySession

LIVY_URL = 'http://spark.example.com:8998'

with LivySession(LIVY_URL) as session:
    # Run some code on the remote cluster
    session.run("filtered = df.filter(df.name == 'Bob')")
    # Retrieve the result
    local_df = session.read('filtered')
Authenticate requests sent to Livy by passing any Auth object from the requests library to LivySession. For example, to perform HTTP basic auth:
from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('username', 'password')

with LivySession(LIVY_URL, auth) as session:
    session.run("filtered = df.filter(df.name == 'Bob')")
    local_df = session.read('filtered')
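Because any requests Auth object is accepted, schemes other than basic auth can be plugged in by subclassing requests.auth.AuthBase. The following is a minimal sketch, assuming a deployment that expects a bearer token in the Authorization header (for example, a proxy in front of Livy); the BearerTokenAuth class and the token value are hypothetical and not part of pylivy or requests. The header is checked locally on a prepared request, so no server is contacted.

```python
from requests import Request
from requests.auth import AuthBase


class BearerTokenAuth(AuthBase):
    """Hypothetical auth class that attaches a bearer token to each request."""

    def __init__(self, token):
        self.token = token

    def __call__(self, request):
        # requests calls this hook while preparing every outgoing request.
        request.headers["Authorization"] = f"Bearer {self.token}"
        return request


# Prepare (but do not send) a request to confirm the header is set.
prepared = Request(
    "GET",
    "http://spark.example.com:8998/sessions",
    auth=BearerTokenAuth("my-token"),
).prepare()
print(prepared.headers["Authorization"])
```

An instance of such a class would be passed in the same position as the HTTPBasicAuth object above, for example LivySession(LIVY_URL, BearerTokenAuth('my-token')). Whether the server accepts this scheme depends on how the Livy deployment is configured.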