Livy is an open source REST interface for interacting with Spark. pylivy is a Python client for Livy, enabling easy remote code execution on a Spark cluster.


$ pip install -U livy

Note that pylivy requires Python 3.6 or later.


The LivySession class is the main interface provided by pylivy:

from livy import LivySession


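# Placeholder: replace with the address of your Livy server
LIVY_URL = 'http://spark.example.com:8998'

with LivySession(LIVY_URL) as session:
    # Run some code on the remote cluster
    session.run("filtered = df.filter(df.name == 'Bob')")
    # Retrieve the result
    local_df = session.read('filtered')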
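
Because Livy is a REST interface, running code on the cluster ultimately comes down to HTTP requests. The sketch below is not pylivy's implementation; it only illustrates, using the standard library, roughly what a request to Livy's documented `POST /sessions/{sessionId}/statements` endpoint looks like. The helper name `build_statement_request`, the server address, and the session id are all hypothetical placeholders, and the request is built but never sent.

```python
import json
from urllib.request import Request


def build_statement_request(livy_url: str, session_id: int, code: str) -> Request:
    """Build (but do not send) a POST request submitting code to a Livy session."""
    # Livy exposes statements under /sessions/{sessionId}/statements
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}/statements"
    # The request body is a JSON object carrying the code to execute
    body = json.dumps({"code": code}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_statement_request(
    "http://spark.example.com:8998",  # placeholder Livy address
    0,  # placeholder session id
    "filtered = df.filter(df.name == 'Bob')",  # example code string
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

In practice pylivy also takes care of creating the session, polling for the statement's result, and closing the session, which is why the LivySession interface is the more convenient entry point.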

Authenticate requests sent to Livy by passing any requests Auth object to the LivySession. For example, to perform HTTP basic auth do:

from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('username', 'password')

with LivySession(LIVY_URL, auth) as session:
    session.run("filtered = df.filter(df.name == 'Bob')")
    local_df = session.read('filtered')
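
For context, HTTP basic auth works by attaching an `Authorization` header containing the base64-encoded `username:password` pair to each request. The following is a minimal standard-library sketch of that encoding for ASCII credentials; the helper name `basic_auth_header` is hypothetical, and the code is an illustration of the scheme rather than how requests or pylivy implement it.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value used by HTTP basic auth."""
    # Credentials are joined with a colon, then base64-encoded
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("username", "password"))
# prints "Basic dXNlcm5hbWU6cGFzc3dvcmQ="
```

Base64 is an encoding, not encryption, so basic auth should only be used with a Livy server reached over HTTPS.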

API Documentation