pylivy

https://travis-ci.org/acroz/pylivy.svg?branch=master https://badge.fury.io/py/livy.svg

Livy is an open source REST interface for interacting with Spark. pylivy is a Python client for Livy, enabling easy remote code execution on a Spark cluster.

Installation

$ pip install -U livy

Note that pylivy requires Python 3.6 or later.

Usage

The LivySession class is the main interface provided by pylivy:

from livy import LivySession

LIVY_URL = 'http://spark.example.com:8998'

with LivySession(LIVY_URL) as session:
    # Run some code on the remote cluster
    session.run("filtered = df.filter(df.name == 'Bob')")
    # Retrieve the result
    local_df = session.read('filtered')

Authenticate requests sent to Livy by passing any requests Auth object to the LivySession. For example, to perform HTTP basic auth do:

from requests.auth import HTTPBasicAuth

auth = HTTPBasicAuth('username', 'password')

with LivySession(LIVY_URL, auth) as session:
    session.run("filtered = df.filter(df.name == 'Bob')")
    local_df = session.read('filtered')

API Documenation