plpipes.spark

__dir_to_config(*args, url=False, **kwargs)

Convert directory path to a string representation.

This function builds a file path from the given arguments and options, resolves it, and replaces backslashes with forward slashes. If url is True, the result is returned with a file:// prefix.

Parameters:

Name      Type  Description                                             Default
*args           Arguments to build the file path.                       ()
url       bool  If True, returns the path as a URL. Defaults to False.  False
**kwargs        Additional keyword arguments for path construction.     {}

Returns:

Type  Description
str   The constructed file path or URL.

Source code in src\plpipes\spark\__init__.py
def __dir_to_config(*args, url=False, **kwargs):
    """Convert directory path to a string representation.

    This function constructs a file path from given arguments and
    options, resolving the path and converting it to a URL if specified.

    Args:
        *args: Arguments to build the file path.
        url (bool): If True, returns the path as a URL. Defaults to False.
        **kwargs: Additional keyword arguments for path construction.

    Returns:
        str: The constructed file path or URL.
    """
    s = str(fs.path(*args, **kwargs).resolve()).replace("\\", "/")
    if url:
        return f'file://{s}'
    return s
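
The path handling above can be sketched without plpipes. The block below is an illustration only: `dir_to_config` is a hypothetical stand-in, and it assumes that `fs.path(...)` behaves like a `pathlib.Path` constructor, which the listing above does not confirm.

```python
from pathlib import Path


def dir_to_config(*parts, url=False):
    """Hypothetical stand-in for __dir_to_config, using pathlib only."""
    # Assumption: fs.path(*args) acts like pathlib.Path(*args).
    # Resolve to an absolute path, then normalise separators to "/".
    s = str(Path(*parts).resolve()).replace("\\", "/")
    if url:
        # Same prefixing as in the listing above.
        return f"file://{s}"
    return s


plain = dir_to_config("data", "warehouse")
as_url = dir_to_config("data", "warehouse", url=True)
print(plain)   # absolute path ending in data/warehouse
print(as_url)  # the same path with a file:// prefix
```

On POSIX systems the resolved path starts with `/`, so the prefixed form has three slashes (`file:///...`). On Windows the resolved path starts with a drive letter, so plain prefixing yields only two slashes before it. If a strictly conforming file URI is needed there, `Path.as_uri()` may be worth considering as an alternative.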

spark_session()

Retrieve or create a Spark session.

This function checks if a Spark session already exists and returns it. If not, it initializes a new Spark session based on the configuration specified in the 'spark' section of the configuration object.

Returns:

Type           Description
(unspecified)  Spark session instance.

Source code in src\plpipes\spark\__init__.py
def spark_session():
    """Retrieve or create a Spark session.

    This function checks if a Spark session already exists and returns it.
    If not, it initializes a new Spark session based on the configuration
    specified in the 'spark' section of the configuration object.

    Returns:
        Spark session instance.
    """
    global _spark_session
    if _spark_session is None:
        _spark_session = _init_spark_session()
    return _spark_session