Apache (Py)Spark type annotations (stub files).

PySpark Stubs

A collection of Apache Spark stub files. These files were generated by stubgen and then manually edited to include accurate type hints.

Tests and configuration files were originally contributed to the Typeshed project. Please refer to its contributors list and license for details.

Motivation

  • Static error detection (see SPARK-20631)

  • Improved completion for chained method calls

Installation and usage

Please note that the guidelines for distributing type information are still a work in progress (PEP 561 - Distributing and Packaging Type Information). Currently, the installation script overlays existing Spark installations (the pyi stub files are copied next to their py counterparts in the PySpark installation directory). If this approach is not acceptable, you can add the stub files to the search path manually.
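Because the overlay simply places each pyi file next to its py counterpart, it can be inspected with a short script. This is only a sketch, not part of the package; the overlay_fraction helper and the example path are hypothetical:

```python
from pathlib import Path

def overlay_fraction(pkg_dir: str) -> float:
    """Fraction of .py modules under pkg_dir that have a .pyi stub sibling."""
    py = {p.with_suffix("") for p in Path(pkg_dir).rglob("*.py")}
    pyi = {p.with_suffix("") for p in Path(pkg_dir).rglob("*.pyi")}
    return len(py & pyi) / len(py) if py else 0.0

# Example (the path is hypothetical):
# print(overlay_fraction("/usr/lib/python3.7/site-packages/pyspark"))
```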

According to PEP 484:

Third-party stub packages can use any location for stub storage. Type checkers should search for them using PYTHONPATH.

Moreover:

A default fallback directory that is always checked is shared/typehints/python3.5/ (or 3.6, etc.)
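For the manual approach, one option (a sketch; the directory below is an example, and mypy is assumed as the type checker) is to point MYPYPATH at a local copy of the stubs:

```shell
# Make a locally downloaded copy of the stubs visible to mypy without
# touching the PySpark installation. The directory is an example.
export MYPYPATH="$HOME/stubs/pyspark-stubs"
# mypy will now consult the .pyi files when checking your code, e.g.:
# mypy my_spark_job.py
```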

Please check usage before proceeding.

The package is available on PyPI:

pip install pyspark-stubs

and conda-forge:

conda install -c conda-forge pyspark-stubs

Depending on your environment, you might also need a type checker, such as Mypy or Pytype.

This package is tested against the MyPy development branch and, in rare cases (primarily when it depends on important upstream bug fixes), is not compatible with the preceding MyPy release.

PySpark Version Compatibility

Package versions follow PySpark versions, with the exception of maintenance releases - i.e. pyspark-stubs==2.3.0 should be compatible with pyspark>=2.3.0,<2.4.0. Maintenance releases (post1, post2, ..., postN) are reserved for internal annotation updates.
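The convention above can be expressed as a small helper (a sketch for illustration only; compatible_pyspark is not part of this package):

```python
# Sketch: derive the compatible pyspark requirement from a
# pyspark-stubs version, per the convention described above.
def compatible_pyspark(stubs_version: str) -> str:
    # Strip any .postN maintenance suffix, e.g. "2.3.0.post1" -> "2.3.0"
    base = stubs_version.split(".post")[0]
    major, minor, _patch = base.split(".")
    return f"pyspark>={major}.{minor}.0,<{major}.{int(minor) + 1}.0"

print(compatible_pyspark("2.3.0"))        # pyspark>=2.3.0,<2.4.0
print(compatible_pyspark("2.4.0.post2"))  # pyspark>=2.4.0,<2.5.0
```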

API Coverage

Module  Notes
pyspark
pyspark.accumulators
pyspark.broadcast Mixed
pyspark.cloudpickle Internal
pyspark.conf
pyspark.context
pyspark.daemon Internal
pyspark.files
pyspark.find_spark_home Internal
pyspark.heapq3 Internal
pyspark.java_gateway Internal
pyspark.join
pyspark.ml
pyspark.ml.base
pyspark.ml.classification
pyspark.ml.clustering
pyspark.ml.common Mixed
pyspark.ml.evaluation
pyspark.ml.feature
pyspark.ml.fpm
pyspark.ml.image
pyspark.ml.linalg
pyspark.ml.param
pyspark.ml.param._shared_params_code_gen Internal
pyspark.ml.param.shared
pyspark.ml.pipeline
pyspark.ml.recommendation
pyspark.ml.regression
pyspark.ml.stat
pyspark.ml.tests Tests
pyspark.ml.tuning
pyspark.ml.util
pyspark.ml.wrapper Mixed
pyspark.mllib
pyspark.mllib.classification
pyspark.mllib.clustering
pyspark.mllib.common
pyspark.mllib.evaluation
pyspark.mllib.feature
pyspark.mllib.fpm
pyspark.mllib.linalg
pyspark.mllib.linalg.distributed
pyspark.mllib.random
pyspark.mllib.recommendation
pyspark.mllib.regression
pyspark.mllib.stat
pyspark.mllib.stat.KernelDensity
pyspark.mllib.stat._statistics
pyspark.mllib.stat.distribution
pyspark.mllib.stat.test
pyspark.mllib.tests Tests
pyspark.mllib.tree
pyspark.mllib.util
pyspark.profiler
pyspark.resourceinformation
pyspark.rdd
pyspark.rddsampler
pyspark.resultiterable
pyspark.serializers
pyspark.shell Internal
pyspark.shuffle Internal
pyspark.sql
pyspark.sql.catalog
pyspark.sql.cogroup
pyspark.sql.column
pyspark.sql.conf
pyspark.sql.context
pyspark.sql.dataframe
pyspark.sql.functions
pyspark.sql.group
pyspark.sql.readwriter
pyspark.sql.session
pyspark.sql.streaming
pyspark.sql.tests Tests
pyspark.sql.types
pyspark.sql.udf
pyspark.sql.utils
pyspark.sql.window
pyspark.statcounter
pyspark.status
pyspark.storagelevel
pyspark.streaming
pyspark.streaming.context
pyspark.streaming.dstream
pyspark.streaming.kinesis
pyspark.streaming.listener
pyspark.streaming.tests Tests
pyspark.streaming.util
pyspark.taskcontext
pyspark.tests Tests
pyspark.traceback_utils Internal
pyspark.util
pyspark.version
pyspark.worker Internal

Disclaimer

Apache Spark, Spark, PySpark, Apache, and the Spark logo are trademarks of The Apache Software Foundation. This project is not owned, endorsed, or sponsored by The Apache Software Foundation.

License: Apache License 2.0