This repo contains material for a 1-semester course
on Natural Language Processing with Python.
The target audience is students, researchers, developers, hobbyists, and anyone else interested in learning more about Natural Language Processing and Text Analytics.
Some very basic knowledge of Python is assumed (e.g. if you have seen some Python script before, you're good to go), but no previous NLP knowledge is required.
The code has been tested only with Python 2.7 on 64-bit Windows 7.
Step 1 - Navigate to desktop and clone this repo
- Open PowerShell: press the Windows and R keys together, then release them. A Run dialog box should open. Type “powershell” in the Run dialog box and click the OK button.
- In PowerShell, type the command
cd c:\users\yourusername\desktop
(be sure to substitute your own user name for yourusername) and press the Enter key.
- Now type the command
git clone https://github.com/duttashi/learnlp
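Once the clone finishes, one way to confirm that it landed where expected is to check for the folder from Python. The sketch below is only an illustration: the helper name `repo_on_desktop` is hypothetical, and it assumes the Desktop folder sits directly under your user profile, as in the `cd` command above.

```python
import os


def repo_on_desktop(repo_name='learnlp', home=None):
    """Return the expected clone path and whether that folder exists."""
    # expanduser('~') resolves to the current user's profile folder.
    if home is None:
        home = os.path.expanduser('~')
    path = os.path.join(home, 'Desktop', repo_name)
    return path, os.path.isdir(path)


path, found = repo_on_desktop()
print('%s: %s' % (path, 'found' if found else 'not found'))
```

If the folder is reported as not found, re-check the directory you were in when you ran `git clone`.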
Step 2 - Install Anaconda and IPython Notebook
- Download and install Anaconda from the Anaconda website. Choose the Python 2.7 version and accept the default options when prompted during installation.
- Launch IPython Notebook by typing
jupyter notebook
in PowerShell.
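Since the material was tested only with Python 2.7 on 64-bit Windows 7, it can help to confirm which interpreter the notebook is actually running. The following is a minimal sketch using only the standard library; the function name `describe_environment` is made up for this example.

```python
import platform
import sys


def describe_environment():
    """Collect the interpreter version, OS name, and interpreter bitness."""
    return {
        'python': platform.python_version(),
        'os': '%s %s' % (platform.system(), platform.release()),
        # sys.maxsize exceeds 2**32 only on a 64-bit interpreter.
        '64bit': sys.maxsize > 2**32,
    }


env = describe_environment()
print('Python %s on %s (64-bit interpreter: %s)'
      % (env['python'], env['os'], env['64bit']))
if not env['python'].startswith('2.7'):
    print('Note: this material was tested with Python 2.7 only.')
```

Run it in a notebook cell; a different version does not necessarily mean the notebooks will fail, only that they were not tested there.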
Step 3 - Check installed library versions
- Click the New button in the notebook to create a new notebook, then run the following code in a cell:
import scipy
print('scipy: %s' % scipy.__version__)
import numpy
print('numpy: %s' % numpy.__version__)
import matplotlib
print('matplotlib: %s' % matplotlib.__version__)
import pandas
print('pandas: %s' % pandas.__version__)
import statsmodels
print('statsmodels: %s' % statsmodels.__version__)
import sklearn
print('sklearn: %s' % sklearn.__version__)
You should see output like the following:
scipy: 0.19.0
numpy: 1.12.1
matplotlib: 2.0.2
pandas: 0.20.1
statsmodels: 0.8.0
sklearn: 0.18.1
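The same check can be written as a loop that also reports libraries that are missing, instead of stopping at the first failed import. This is a sketch, not part of the course material; `installed_version` and `LIBRARIES` are names chosen for the example.

```python
import importlib

LIBRARIES = ['scipy', 'numpy', 'matplotlib', 'pandas', 'statsmodels', 'sklearn']


def installed_version(name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Some modules do not define __version__; report that explicitly.
    return getattr(module, '__version__', 'unknown')


for name in LIBRARIES:
    version = installed_version(name)
    print('%s: %s' % (name, version if version is not None else 'NOT INSTALLED'))
```

Your version numbers may differ from the ones listed above; what matters is that every library imports.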
Step 4 - Install Deep Learning Libraries
In this step, we will install Python libraries used for deep learning, specifically: Theano, TensorFlow, and Keras.
NOTE: If you encounter an error while installing the deep learning libraries, check the Issues tab of this repo or search for answers on www.stackoverflow.com.
- Install the Theano deep learning library by typing:
conda install theano
Confirm that Theano is installed and working correctly by running the following code in the IPython notebook:
# theano
import theano
print('theano: %s' % theano.__version__)
You should see output like:
theano: 0.9.0.dev-c697eeab84e5b8a74908da654b66ec9eca4f1291
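- Install Keras by typing:
pip install keras
Confirm the installation by running the following code in the IPython notebook:
import keras
print('keras: %s' % keras.__version__)
You should see output like:
Using TensorFlow backend.
keras: 2.0.8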
- Install TensorFlow. First activate the conda environment named tensorflow (this assumes such an environment already exists) by typing:
activate tensorflow
Your prompt should change to something like:
(tensorflow)C:>
- To install the CPU-only version of TensorFlow, enter the following command:
(tensorflow)C:> pip install --ignore-installed --upgrade tensorflow
- To install the GPU version of TensorFlow, enter the following command (on a single line):
(tensorflow)C:> pip install --ignore-installed --upgrade tensorflow-gpu
- Validate the installation by launching the IPython Notebook and running the following code:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
If the system outputs the following, then you are ready to begin writing TensorFlow programs:
Hello, TensorFlow!
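The validation code above relies on `tf.Session`, which is available in TensorFlow 1.x but was removed from the top-level namespace in TensorFlow 2.x. If `pip` installed a newer release, a version-tolerant check along these lines may be more useful. It is a sketch under that assumption, and the function name `hello_tensorflow` is hypothetical.

```python
def hello_tensorflow():
    """Evaluate a constant string tensor; return None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    hello = tf.constant('Hello, TensorFlow!')
    if hasattr(tf, 'Session'):
        # TensorFlow 1.x: graphs are evaluated inside a session.
        with tf.Session() as sess:
            return sess.run(hello)
    # TensorFlow 2.x: eager execution is on by default, so read the value directly.
    return hello.numpy()


result = hello_tensorflow()
if result is None:
    print('TensorFlow is not installed in this environment.')
else:
    print(result)
```

Depending on the Python and TensorFlow versions, the greeting may be printed as a byte string rather than plain text.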
Congratulations, you now have a working Python development environment for machine learning.
You can now learn and practice machine learning and deep learning on your workstation.
See the scripts folder, which contains IPython notebooks for further learning.
Here are some interesting questions and answers on StackOverflow. I recommend reading them in this order: on statistical knowledge, 1; on data mining, 2 and 3.
Enjoy and Keep Calm!