Hadoop 2.7.3 + Spark 2.1.0 fully distributed cluster setup walkthrough: https://www.cnblogs.com/zengxiaoliang/p/6478859.html
https://blog.csdn.net/u012804180/article/details/79081424
export PATH=/usr/local/python3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/python3/lib:$LD_LIBRARY_PATH
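A quick sanity check that the two exports take effect (this assumes Python 3 was built with --prefix=/usr/local/python3; on a machine without that build, python3 simply falls back to the system interpreter):

```shell
# Put the /usr/local/python3 build first on the lookup paths
export PATH=/usr/local/python3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/python3/lib:$LD_LIBRARY_PATH

# Verify which python3 is now found and that it runs
command -v python3
python3 --version 2>/dev/null || echo "python3 not found on PATH"
```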
CentOS 7: installing TeamViewer fails with a missing dependency, libQt5WebKitWidgets.so.5()(64bit)
https://blog.csdn.net/kenny_lz/article/details/78884603
TUNA PyPI mirror usage guide: https://mirrors.tuna.tsinghua.edu.cn/help/pypi/
pip install somepackage==X.X.X (X.X.X is the version number)
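Combining the two notes above: pin an exact version and pull it through the TUNA mirror (the package name and 1.2.3 are placeholders):

```shell
# One-off: install a pinned version via the TUNA PyPI index
pip install somepackage==1.2.3 -i https://pypi.tuna.tsinghua.edu.cn/simple

# Optional: make the mirror the default index for all future installs
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
```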
Local file: file:///home/z/Desktop/README.md
HDFS file: hdfs://Master:9000/user/README.md
sc.textFile("path") reads from HDFS by default; prefixing the path with hdfs:// reads explicitly from the HDFS filesystem.
To read from the local filesystem instead, prefix the path with file://, e.g. file:///home/user/spark/README.md
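The path rule above can be sketched as a small shell helper; `resolve` is an illustrative name, not a Spark API, and the cluster-default filesystem is assumed to be the hdfs://Master:9000 from the example paths:

```shell
# Sketch of how sc.textFile() picks a filesystem from the path prefix
# (resolve is a hypothetical helper for illustration, not part of Spark)
resolve() {
  case "$1" in
    file://*) echo "local filesystem" ;;
    hdfs://*) echo "HDFS (explicit)" ;;
    *)        echo "HDFS (cluster default)" ;;
  esac
}

resolve "file:///home/user/spark/README.md"   # local filesystem
resolve "hdfs://Master:9000/user/README.md"   # HDFS (explicit)
resolve "/user/README.md"                     # HDFS (cluster default)
```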
Set the following in both /etc/profile and spark/conf/spark-env.sh; otherwise running PySpark reports a Python version mismatch error:
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=ipython
#export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
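The version-mismatch error happens when the driver and the workers resolve PYSPARK_PYTHON to different interpreters. A rough sketch of the check Spark performs (the error wording is approximate; both sides here deliberately query the same python3, whereas on a real cluster each host resolves it on its own PATH):

```shell
# Approximate sketch of PySpark's driver/worker Python version check.
# Both version strings come from the same interpreter here, so it passes.
driver_ver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
worker_ver="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"

if [ "$driver_ver" != "$worker_ver" ]; then
  echo "Python in worker has different version $worker_ver than that in driver $driver_ver" >&2
fi
```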
https://www.cnblogs.com/flying607/p/5730851.html