pylivy
Livy is an open source REST interface for interacting with Spark. pylivy is a Python client for Livy, enabling easy remote code execution on a Spark cluster.
Usage
The LivySession class is the main interface provided by pylivy:
    from livy import LivySession

    LIVY_URL = 'http://spark.example.com:8998'

    with LivySession(LIVY_URL) as session:
        # Run some code on the remote cluster
        # (assumes a DataFrame named `df` already exists in the remote session)
        session.run("filtered = df.filter(df.name == 'Bob')")
        # Retrieve the result
        local_df = session.read('filtered')
To authenticate requests sent to Livy, pass any requests auth object to LivySession. For example, to use HTTP basic auth:
    from livy import LivySession
    from requests.auth import HTTPBasicAuth

    # LIVY_URL is the server URL defined in the previous example
    auth = HTTPBasicAuth('username', 'password')

    with LivySession(LIVY_URL, auth) as session:
        session.run("filtered = df.filter(df.name == 'Bob')")
        local_df = session.read('filtered')
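For reference, HTTP basic auth amounts to sending an `Authorization` header that carries the Base64-encoded `username:password` pair with each request. The stdlib-only sketch below shows how that header is formed. `basic_auth_header` is a hypothetical helper written for illustration; it is not part of pylivy or requests.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build the Authorization header used by HTTP basic auth (RFC 7617).

    Hypothetical helper for illustration only. In practice, passing an
    HTTPBasicAuth object lets requests build this header for you.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("username", "password")
print(header["Authorization"])
```

Because the credentials are only encoded, not encrypted, basic auth should be used over HTTPS.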
API Documentation
livy.session
- class livy.session.LivySession(url, auth=None, kind=<SessionKind.PYSPARK: 'pyspark'>, proxy_user=None, spark_conf=None, echo=True, check=True)
  Manages a remote Livy session and high-level interactions with it.
  - Parameters
    - url (str) – The URL of the Livy server.
    - kind (SessionKind) – The kind of session to create.
    - proxy_user (Optional[str]) – User to impersonate when starting the session.
    - spark_conf (Optional[Dict[str, Any]]) – Spark configuration properties.
    - echo (bool) – Whether to echo output printed in the remote session. Defaults to True.
    - check (bool) – Whether to raise an exception when a statement in the remote session fails. Defaults to True.
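As a rough illustration of how these parameters fit together, the sketch below collects them into keyword arguments for `LivySession`. The Spark property names are standard Spark settings, but the values, the `analyst` user name, and the `build_session_kwargs` and `open_session` helpers are assumptions made for illustration. The `livy` import is deferred into `open_session` so that the first helper can be exercised without a Livy server.

```python
from typing import Any, Dict, Optional


def build_session_kwargs(
    proxy_user: Optional[str] = None,
    executor_memory: str = "2g",
    executor_cores: int = 2,
    echo: bool = True,
    check: bool = True,
) -> Dict[str, Any]:
    """Collect keyword arguments for LivySession (hypothetical helper)."""
    spark_conf: Dict[str, Any] = {
        "spark.executor.memory": executor_memory,
        "spark.executor.cores": executor_cores,
    }
    return {
        "proxy_user": proxy_user,
        "spark_conf": spark_conf,
        "echo": echo,
        "check": check,
    }


def open_session(url: str, **kwargs: Any):
    """Create a LivySession; needs pylivy and a reachable Livy server."""
    from livy import LivySession  # deferred: only needed when connecting

    return LivySession(url, **kwargs)


kwargs = build_session_kwargs(proxy_user="analyst", echo=False)
print(kwargs)
```

With a server available, the result could be used as `with open_session(LIVY_URL, **kwargs) as session:`.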
- property state
  The state of the managed Spark session.
  - Return type
    SessionState
- run(code)
  Run some code in the managed Spark session.
  - Parameters
    - code (str) – The code to run.
  - Return type
    Output
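Because `run` takes the code as a single string, multi-line snippets are easiest to write as a dedented triple-quoted string. The following is a minimal sketch, assuming a PySpark session in which a DataFrame named `df` already exists; the snippet contents and the `remote_code` name are illustrative.

```python
import textwrap

# Code to execute remotely. `df` is assumed to already exist in the
# remote Spark session; it is not defined locally.
remote_code = textwrap.dedent(
    """
    filtered = df.filter(df.name == 'Bob')
    row_count = filtered.count()
    print(row_count)
    """
).strip()

# Check locally that the snippet is syntactically valid Python before
# sending it to the cluster. compile() parses the code without running it.
compile(remote_code, "<remote>", "exec")

# With an open session, the snippet would then be submitted as:
#     output = session.run(remote_code)
```

The local syntax check only makes sense for Python session kinds; code for other kinds would be passed to `run` as an unchecked string.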
livy.client
- class livy.client.LivyClient(url, auth=None)
  A client for sending requests to a Livy server.
  - Parameters
    - url (str) – The URL of the Livy server.
    - auth (Union[AuthBase, Tuple[str, str], None]) – A requests-compatible auth object to use when making requests.
- legacy_server()
  Determine if the server is running a legacy version.
  Legacy versions support different session kinds than newer versions of Livy.
  - Return type
    bool
- create_session(kind, proxy_user=None, spark_conf=None)
  Create a new session in Livy.
  - Parameters
    - kind (SessionKind) – The kind of session to create.
    - proxy_user (Optional[str]) – User to impersonate when starting the session.
    - spark_conf (Optional[Dict[str, Any]]) – Spark configuration properties.
  - Return type
    Session
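In Livy's REST API, creating a session is a `POST /sessions` request with a JSON body. The sketch below shows roughly what such a body looks like for the arguments above, using field names from the Livy REST documentation (`kind`, `proxyUser`, `conf`). `session_request_body` is a hypothetical helper, and pylivy's actual internals may differ.

```python
import json
from typing import Any, Dict, Optional


def session_request_body(
    kind: str,
    proxy_user: Optional[str] = None,
    spark_conf: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a JSON-serialisable body for Livy's POST /sessions endpoint.

    Hypothetical helper for illustration; not part of pylivy.
    """
    body: Dict[str, Any] = {"kind": kind}
    # Optional fields are only included when they are set.
    if proxy_user is not None:
        body["proxyUser"] = proxy_user
    if spark_conf is not None:
        body["conf"] = spark_conf
    return body


body = session_request_body(
    "pyspark",
    proxy_user="analyst",
    spark_conf={"spark.executor.memory": "2g"},
)
print(json.dumps(body, indent=2))
```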
- get_session(session_id)
  Get information about a session.
  - Parameters
    - session_id (int) – The ID of the session.
  - Return type
    Optional[Session]
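Since `get_session` returns `Optional[Session]`, callers need to handle the case where nothing comes back. The sketch below polls until a session reaches the `idle` state. It assumes that `None` means the session does not exist and that the returned object exposes a `state` attribute whose value (or enum `.value`) is one of the state strings from the Livy REST API. Both points are assumptions, and `wait_until_idle` is a hypothetical helper rather than part of pylivy.

```python
import time
from typing import Any

# Session states (per the Livy REST API) from which a session is not
# expected to become usable again.
TERMINAL_STATES = {"shutting_down", "error", "dead", "killed", "success"}


def wait_until_idle(
    client: Any,
    session_id: int,
    interval: float = 1.0,
    timeout: float = 60.0,
) -> None:
    """Poll client.get_session until the session is idle (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        session = client.get_session(session_id)
        if session is None:
            raise LookupError(f"session {session_id} not found")
        # Assumption: `state` is a plain string or an enum with `.value`.
        state = getattr(session.state, "value", session.state)
        if state == "idle":
            return
        if state in TERMINAL_STATES:
            raise RuntimeError(f"session {session_id} ended in state {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"session {session_id} still {state!r} after {timeout}s"
            )
        time.sleep(interval)
```

Any object with a compatible `get_session` method can be passed as `client`, which also makes the helper easy to exercise with a stub.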
- delete_session(session_id)
  Kill a session.
  - Parameters
    - session_id (int) – The ID of the session.
  - Return type
    None
- list_statements(session_id)
  Get all the statements in a session.
  - Parameters
    - session_id (int) – The ID of the session.
  - Return type
    List[Statement]