
Install Jupyter Notebook – Mac OS X

Upgrade to Python 3.x

Download and install Python 3.x. For this tutorial I have used 3.5.

Once you have downloaded and run the installer, Python 3 will be installed under:

/Library/Frameworks/Python.framework/Versions/3.5/bin/python3

The installer also adds this directory to your PATH in .bash_profile, so that when you type:

python3

on the command line, the system can find it. You'll know you've been successful if you see the Python interpreter launch.
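To confirm the upgrade worked, you can check where the shell resolves python3 and which version it reports (the path in the comment is the macOS framework default from the installer):

```shell
# Show which python3 the shell finds; on macOS this should be the framework path above
which python3
# Print the interpreter version; for this tutorial it should report 3.5.x
python3 --version
```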

Install pip

Fire up your Terminal and type:

sudo easy_install pip

(Python 3.4 and later already bundles pip as pip3, so this step is mainly needed for the system Python 2.)

Install PySpark on Mac

  1. Go to the Spark downloads page and choose a Spark release. For this tutorial I chose spark-2.0.1-bin-hadoop2.7.
  2. Choose a package type. For this tutorial I chose Pre-built for Hadoop 2.7 and later.
  3. Choose a download type: (Direct Download)
  4. Download Spark: spark-2.0.1-bin-hadoop2.7.tgz
  5. Extract the archive in your home directory using the following command: tar -zxvf spark-2.0.1-bin-hadoop2.7.tgz. I prefer to create an opt directory in my home directory and extract it under ~/opt/.
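The steps above can be sketched as a few shell commands (the download URL is an assumption based on the Apache archive layout; adjust it to the release and package type you picked):

```shell
# Create ~/opt and fetch the chosen Spark release into it
# (URL assumed from the Apache archive layout; verify it on the downloads page)
mkdir -p ~/opt
cd ~/opt
curl -L -O https://archive.apache.org/dist/spark/spark-2.0.1/spark-2.0.1-bin-hadoop2.7.tgz
# Extract the archive; this creates ~/opt/spark-2.0.1-bin-hadoop2.7
tar -zxvf spark-2.0.1-bin-hadoop2.7.tgz
```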

Next, we will edit our .bash_profile so we can open a Spark notebook from any directory. So fire up your Terminal and type in:

nano .bash_profile

My .bash_profile looks as follows:

export SPARK_PATH=~/opt/spark-2.0.1-bin-hadoop2.7/bin
export PYSPARK_PYTHON="python3"
export PYSPARK_DRIVER_PYTHON="jupyter" 
export PYSPARK_DRIVER_PYTHON_OPTS="notebook" 
alias snotebook='$SPARK_PATH/pyspark --master local[2]'
export PATH="$SPARK_PATH:$PATH"

export GRADLE_HOME="/Users/lucas/opt/gradle-2.2.1"
export PATH="$PATH:$GRADLE_HOME/bin"

export ANT_HOME="/Users/lucas/opt/apache-ant-1.9.4"
export PATH="$PATH:$ANT_HOME/bin"

export M2_HOME="/Users/lucas/opt/apache-maven-3.2.5"
export PATH="$PATH:$M2_HOME/bin"
export PATH="/usr/local/mysql/bin:$PATH"

export MONGODB_HOME="/Users/lucas/opt/mongodb-osx-x86_64-3.0.4"
export PATH="$PATH:$MONGODB_HOME/bin"

export JASYPT_HOME="/Users/lucas/opt/jasypt-1.9.2"
export PATH="$PATH:$JASYPT_HOME/bin"

export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home"

export PATH="/opt/local/bin:/opt/local/sbin:$PATH"

# Setting PATH for Python 3.5
PATH="/Library/Frameworks/Python.framework/Versions/3.5/bin:${PATH}"

export PATH

The relevant stuff is:

export SPARK_PATH=~/opt/spark-2.0.1-bin-hadoop2.7/bin
export PYSPARK_PYTHON="python3"
export PYSPARK_DRIVER_PYTHON="jupyter" 
export PYSPARK_DRIVER_PYTHON_OPTS="notebook" 
alias snotebook='$SPARK_PATH/pyspark --master local[2]'
export PATH="$SPARK_PATH:$PATH"

The PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS variables make the PySpark shell launch inside Jupyter Notebook. The --master parameter sets the master URL; local[2] runs Spark locally with two worker threads, which is plenty for local testing.
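For reference, --master accepts any of Spark's standard master URLs, so the same alias idea works against a cluster too (the host name below is a placeholder):

```shell
pyspark --master local[2]            # run locally with 2 worker threads
pyspark --master local[*]            # run locally with one thread per core
pyspark --master spark://host:7077   # connect to a standalone cluster master
pyspark --master yarn                # run on a YARN cluster
```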

Install Jupyter Notebook with pip

First, ensure that you have the latest pip; older versions may have trouble with some dependencies:

pip3 install --upgrade pip

Then install the Jupyter Notebook using:

pip3 install jupyter

That's it!

After reloading your profile with source ~/.bash_profile, you can now run:

pyspark

on the command line. A browser window should open with Jupyter Notebook running at http://localhost:8888/.

Configure Jupyter Notebook to show line numbers

Run

jupyter --config-dir

to get the Jupyter config directory. Mine is located under /Users/lucas/.jupyter. Run:

cd /Users/lucas/.jupyter

Run:

mkdir custom

to create a custom directory (if it does not already exist). Run:

cd custom

Run:

nano custom.js

and add:

define([
    'base/js/namespace',
    'base/js/events'
    ],
    function(IPython, events) {
        events.on("app_initialized.NotebookApp",
            function () {
                IPython.Cell.options_default.cm_config.lineNumbers = true;
            }
        );
    }
);

You can add any JavaScript here; it will be executed by the Jupyter notebook at load time.

Install a Java 9 Kernel

Install Java 9. The Java home will then be:

/Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home

Install kulla.jar. I have installed it under ~/opt/.

Download the kernel. Again, I placed the entire javakernel directory under ~/opt/.

This kernel expects two environment variables to be defined, which can be set in kernel.json (described below):

KULLA_HOME - the full path to kulla.jar
JAVA_9_HOME - like JAVA_HOME, but pointing to a Java 9 environment

So go ahead and edit kernel.json in the kernel you have just downloaded to look as follows:

{
 "argv": ["python3", "/Users/lucas/opt/javakernel",
          "-f", "{connection_file}"],
 "display_name": "Java 9",
 "language": "java",
 "env" : {
     "JAVA_9_HOME": "/Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home",
     "KULLA_HOME": "/Users/lucas/opt/kulla.jar"
     }
}

Run:

cd /usr/local/share/jupyter/kernels/

Run:

mkdir java

Run:

cp /Users/lucas/opt/javakernel/kernel.json java/

to copy the edited kernel.json into the newly created java directory.

Install gnureadline by running:

pip install gnureadline

on the command line.

If everything worked, you should be able to run the kernel:

jupyter console --kernel java

and see the following output:

java version "9-ea"
Java(TM) SE Runtime Environment (build 9-ea+143)
Java HotSpot(TM) 64-Bit Server VM (build 9-ea+143, mixed mode)
Jupyter console 5.0.0