Install Python packages yourself

You can install Python packages from the Python Package Index (PyPI) into your Jupyter environment using the Pip package manager class. The Pip class wraps the pip command with the additional configuration it needs to work in our environment: it adds authentication and endpoint configuration so that the requested packages are retrieved without violating security protocols. As with the pip command itself, any packages required by the packages you request are resolved and installed automatically.

You can use the Downloader package manager class to download additional data for the Natural Language Toolkit (NLTK) library that is pre-installed in your Jupyter environment.

Pip

The following sections describe how to install and manage Python packages in your Jupyter environment using the Pip package manager class.

Get a Pip instance

To begin, get an instance of our Pip class:

from fbri.package_managers import Pip
pip = Pip.get_instance()

Install packages

Install one or more packages by calling the pip.install function and passing in either a string with the package name (for a single package) or a list of strings (for multiple packages).

Some newly installed packages don’t work properly until you restart the kernel.

We recommend that you create a notebook exclusively for installing packages and that you don’t start other notebooks until you're finished installing packages.

When you’re ready, restart the kernel: go to the Kernel menu and click Restart Kernel.
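Before restarting, it can help to check whether the running kernel can already locate a newly installed package. The following is a minimal sketch that uses only the Python standard library; the helper name is_importable is hypothetical and is not part of the fbri package.

```python
import importlib
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the running kernel can locate the named top-level module."""
    # Clear cached directory listings so that packages installed after the
    # kernel started are taken into account.
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None


print(is_importable("json"))                # True: standard-library module
print(is_importable("no_such_module_xyz"))  # False: nothing by that name
```

If the check returns False immediately after a successful installation, restarting the kernel is the likely fix. Note that the import name of a module can differ from the name of the package that provides it.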

Install a single package

The following example shows how to install a single package:

pip.install("torch") # installing a single package

The system response should look like this:

Writing to /home/jovyan/.config/pip/pip.conf
attempting to install torch...
Executing `python -m pip install  --user torch`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting torch
  Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/torch/1.10.2/torch-1.10.2-cp39-cp39-manylinux1_x86_64.whl (881.9 MB)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.9/site-packages (from torch) (4.1.1)
Installing collected packages: torch
Successfully installed torch-1.10.2
All done
  WARNING: The scripts convert-caffe2-to-onnx, convert-onnx-to-caffe2 and torchrun are installed in '/home/jovyan/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.

Install multiple packages

The following example shows how to install multiple packages:

# installing multiple packages
pip.install([
    "sly",
    "ply",
])

The system response should look like this:

Writing to /home/jovyan/.config/pip/pip.conf
attempting to install sly...
Executing `python -m pip install  --user sly`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting sly
  Using cached sly-0.4-py3-none-any.whl
Installing collected packages: sly
Successfully installed sly-0.4
attempting to install ply...
Executing `python -m pip install  --user ply`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting ply
  Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/ply/3.11/ply-3.11-py2.py3-none-any.whl (49 kB)
Installing collected packages: ply
Successfully installed ply-3.11
All done
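The output above suggests that each requested package is installed with its own pip invocation. The following sketch mirrors that behavior for illustration only. It is reconstructed from the logged commands, it is not the fbri implementation, and the function name build_install_commands is hypothetical.

```python
from typing import List, Sequence, Union


def build_install_commands(
    package_or_packages: Union[Sequence[str], str], *, upgrade: bool = False
) -> List[List[str]]:
    """Return one pip argument list per requested package."""
    # Accept either a single string or a sequence of strings.
    if isinstance(package_or_packages, str):
        packages = [package_or_packages]
    else:
        packages = list(package_or_packages)

    commands = []
    for package in packages:
        command = ["python", "-m", "pip", "install"]
        if upgrade:
            command.append("--upgrade")
        # --user places the package in the user's directory.
        command += ["--user", package]
        commands.append(command)
    return commands


for command in build_install_commands(["sly", "ply"]):
    print(" ".join(command))
```

One practical consequence of this assumed design is that packages are processed in the order in which they appear in the list.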

Install specific versions

You can install a specific version of a package by including one or more version specifiers in the string that you pass to the pip.install function, as shown in the following example:

pip.install("pyathena==2.4.0") # install pyathena version 2.4.0

The system response should look like this:

Writing to /home/jovyan/.config/pip/pip.conf
attempting to install pyathena==2.4.0...
Executing `python -m pip install  --user pyathena==2.4.0`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting pyathena==2.4.0
  Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/pyathena/2.4.0/PyAthena-2.4.0-py3-none-any.whl (38 kB)
Requirement already satisfied: boto3>=1.4.4 in /opt/conda/lib/python3.9/site-packages (from pyathena==2.4.0) (1.20.24)
Requirement already satisfied: botocore>=1.5.52 in /opt/conda/lib/python3.9/site-packages (from pyathena==2.4.0) (1.23.24)
Collecting tenacity>=4.1.0
  Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/tenacity/8.0.1/tenacity-8.0.1-py3-none-any.whl (24 kB)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena==2.4.0) (0.5.2)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena==2.4.0) (0.10.0)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena==2.4.0) (2.8.2)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena==2.4.0) (1.26.8)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore>=1.5.52->pyathena==2.4.0) (1.16.0)
Installing collected packages: tenacity, pyathena
Successfully installed pyathena-2.4.0 tenacity-8.0.1
All done
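When more than one version specifier applies to a package, standard pip requirement syntax separates the specifiers with commas. The following sketch builds such strings; the helper name requirement is hypothetical. It assumes the Pip class passes the string to pip unchanged, as the logged command above suggests.

```python
def requirement(name: str, *specifiers: str) -> str:
    """Join a package name and zero or more version specifiers into one string."""
    return name + ",".join(specifiers)


print(requirement("pyathena", "==2.4.0"))        # pyathena==2.4.0
print(requirement("pyathena", ">=2.4.0", "<3"))  # pyathena>=2.4.0,<3

# Assumed usage in the Jupyter environment:
# pip.install(requirement("pyathena", ">=2.4.0", "<3"))
```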

Upgrade existing packages

You can upgrade an already-installed package to the latest version available in PyPI by passing the upgrade=True argument to the pip.install function. The effect is the same as running the pip install --upgrade command for that package.

Only upgrade packages that you have installed.

Attempting to upgrade a system package can break your environment. See Troubleshooting for recovery suggestions, if needed.

pip.install("pyathena", upgrade=True) # upgrade pyathena

The system response should look like this:

Writing to /home/jovyan/.config/pip/pip.conf
attempting to install pyathena...
Executing `python -m pip install --upgrade --user pyathena`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Requirement already satisfied: pyathena in ./.local/lib/python3.9/site-packages (2.4.0)
Collecting pyathena
  Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/pyathena/2.4.1/PyAthena-2.4.1-py3-none-any.whl (39 kB)
Requirement already satisfied: botocore>=1.5.52 in /opt/conda/lib/python3.9/site-packages (from pyathena) (1.23.24)
Requirement already satisfied: boto3>=1.4.4 in /opt/conda/lib/python3.9/site-packages (from pyathena) (1.20.24)
Requirement already satisfied: tenacity>=4.1.0 in ./.local/lib/python3.9/site-packages (from pyathena) (8.0.1)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena) (0.5.2)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena) (0.10.0)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena) (1.26.8)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena) (2.8.2)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore>=1.5.52->pyathena) (1.16.0)
Installing collected packages: pyathena
  Attempting uninstall: pyathena
    Found existing installation: pyathena 2.4.0
    Uninstalling pyathena-2.4.0:
      Successfully uninstalled pyathena-2.4.0
Successfully installed pyathena-2.4.1
All done
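In the output above, packages that you installed are located under ./.local/lib/python3.9/site-packages, while system packages are located under /opt/conda. The following sketch uses that distinction to check whether a package is safe to upgrade or uninstall. It relies only on the Python standard library, and the helper name installed_in_user_site is hypothetical.

```python
import site
from importlib import metadata
from pathlib import Path
from typing import Optional


def installed_in_user_site(package: str, user_site: Optional[str] = None) -> bool:
    """Return True if the package is installed in the user site directory."""
    user_dir = Path(user_site or site.getusersitepackages()).resolve()
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return False  # not installed at all
    # locate_file("") resolves to the directory that contains the package.
    location = Path(dist.locate_file("")).resolve()
    return location == user_dir or user_dir in location.parents


print(installed_in_user_site("pyathena"))
```

A result of False means that the package is either not installed or is a system package, in which case upgrading or uninstalling it is not recommended.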

Uninstall packages

Uninstall packages by calling the pip.uninstall function. Just like the pip.install function, it accepts either a single string (for a single package) or a list of strings (for multiple packages). Use the pip.list function to see the packages you have installed (see List packages).

Only uninstall packages that you have installed.

Attempting to uninstall system-level packages can lead to unexpected errors. See Troubleshooting for recovery suggestions, if needed.

The following example shows how to uninstall a single package:

pip.uninstall("ply") # uninstall a single package

The system response should look like this:

attempting to install ply...
Executing `python -m pip uninstall --yes ply`
Found existing installation: ply 3.11
Uninstalling ply-3.11:
  Successfully uninstalled ply-3.11
All done

The following example shows how to uninstall multiple packages:

# uninstall multiple packages
pip.uninstall([
  "pyathena",
  "sly",
])

The system response should look like this:

attempting to install pyathena...
Executing `python -m pip uninstall --yes pyathena`
Found existing installation: pyathena 2.4.1
Uninstalling pyathena-2.4.1:
  Successfully uninstalled pyathena-2.4.1
attempting to install sly...
Executing `python -m pip uninstall --yes sly`
Found existing installation: sly 0.4
Uninstalling sly-0.4:
  Successfully uninstalled sly-0.4
All done

List packages

Use the pip.list function to list the packages that you installed (the default) or all installed packages.

The following example shows the list of packages that you installed:

pip.list() # lists only user packages

The system response should look like this:

Executing `python -m pip list --user`
Package  Version
-------- -------
tenacity 8.0.1
torch    1.10.2

In a previous example, we uninstalled pyathena, but that did not automatically uninstall its dependency (tenacity). That's why you see tenacity still listed.
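To find out which dependencies a package might leave behind, you can read its declared requirements before uninstalling it. The following is a minimal sketch that uses the standard importlib.metadata module; the helper names requirement_name and declared_dependencies are hypothetical.

```python
import re
from importlib import metadata
from typing import List


def requirement_name(requirement: str) -> str:
    """Extract the distribution name from a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return match.group(0) if match else ""


def declared_dependencies(package: str) -> List[str]:
    """Return the names of the distributions that a package declares as requirements."""
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []  # the package is not installed
    # Optional (extras) requirements are included; duplicates are removed.
    return sorted({requirement_name(r) for r in requirements} - {""})


print(declared_dependencies("pyathena"))
```

Run this before uninstalling, because the metadata is removed together with the package. A listed dependency might also be required by other packages or be a system package, so review the list before uninstalling anything from it.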

The following example includes the user_only=False flag to list all installed packages:

pip.list(user_only=False) # very long output not shown
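The pip.list function prints its results and returns None, so its output cannot be used directly in code. If you need the installed versions as data, the following sketch collects them with the standard importlib.metadata module. The helper name installed_versions is hypothetical, and the result covers every distribution visible to the running kernel, not only the packages that you installed.

```python
from importlib import metadata
from typing import Dict


def installed_versions() -> Dict[str, str]:
    """Map each installed distribution name to its version."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            versions[name] = version
    return versions


versions = installed_versions()
print(versions.get("torch", "torch is not installed"))
```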

Purge packages

The pip.purge function allows you to start over with no extra packages installed. The following example removes every package that you installed:

pip.purge()

The system response should look like this:

Executing `rm -rf ~/.local/lib/python*`

The pip.list function should show no results after using the pip.purge function:

pip.list() # lists no packages because all user packages have been removed

The system response should look like this:

Executing `python -m pip list --user`
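Because the purge runs rm -rf ~/.local/lib/python*, it can be useful to preview which directories match that pattern before you call pip.purge. The following sketch lists them without deleting anything; the helper name purge_targets is hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def purge_targets(home: Optional[Path] = None) -> List[Path]:
    """List the paths matched by ~/.local/lib/python* without deleting them."""
    home = Path.home() if home is None else Path(home)
    return sorted((home / ".local" / "lib").glob("python*"))


for target in purge_targets():
    print(target)
```

An empty result means that there is nothing for the purge to remove.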

Get help

Use the help function to see the documentation for the Pip instance and its methods:

help(pip)

The system response should look like this:

Help on Pip in module fbri.package_managers.pip object:

class Pip(builtins.object)
 |  Pip(repository_name: str, domain_name: str, repository_url_without_scheme: str, codeartifact_api_endpoint: str, region: str, proxy_service_facade: fbri.common.proxy_service_facade.ProxyServiceFacade) -> None
 |  
 |  Wrapper around the `pip` command to handle authentication and configuration to install packages from our private repository.
 |  
 |  Usage:
 |  ```
 |  pip = Pip.get_instance()
 |  
 |  # Install a single package
 |  pip.install("my-package")
 |  
 |  # Install multiple packages
 |  pip.install([
 |      "package1",
 |      "package2==2.1.0", # specifying a version
 |  ])
 |  
 |  # Upgrade a package
 |  pip.install("package-to-upgrade", upgrade=True)
 |  
 |  # List user packages
 |  pip.list()
 |  
 |  # List all packages
 |  pip.list(user_only=False)
 |  
 |  # Uninstall a single package
 |  pip.uninstall("my-package")
 |  
 |  # Uninstall multiple packages
 |  pip.uninstall([
 |      "package1",
 |      "package2",
 |  ])
 |  
 |  # Purge all user packages
 |  pip.purge()
 |  ```
 |  
 |  Methods defined here:
 |  
 |  __init__(self, repository_name: str, domain_name: str, repository_url_without_scheme: str, codeartifact_api_endpoint: str, region: str, proxy_service_facade: fbri.common.proxy_service_facade.ProxyServiceFacade) -> None
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  install(self, package_or_packages: Union[Sequence[str], str], *, upgrade: bool = False) -> None
 |      Installs a single package or a list of packages in the user directory.
 |      
 |      This function takes care of authenticating against our private Python repository and configuring `pip`
 |      to talk to it. It also makes sure that the package installation goes to the user's directory so that
 |      the package doesn't get installed in ephemeral storage and is lost when the user's pod dies.
 |      
 |      Arguments:
 |      :param package_or_packages: A string containing a single package or a list of strings (to install multiple packages).
 |      :param upgrade: `True` if you want pip to run with the `--upgrade` flage (`False` by default).
 |      :raises PipException: When there is an error executing the `pip` command.
 |      
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |      
 |      # Install single package
 |      pip.install("my_package")
 |      
 |      # Install multiple packages
 |      pip.install([
 |          "package1",
 |          "package2",
 |          ...
 |      ])
 |      
 |      # Install/ugrade package
 |      pip.install("my_package", upgrade=True)
 |      
 |      # Specify a version
 |      pip.install("my_package==2.2.1")
 |      ```
 |  
 |  list(self, *, user_only: bool = True) -> None
 |      Lists all the packages that are installed and outputs them to stdout. By default, it only outputs the user's packages,
 |      but it can also list system packages.
 |      
 |      Arguments:
 |      :param user_only: `True` to display only user installed packages, `False` to include system packages. Defaults to `True`.
 |      
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |      
 |      # List user packages
 |      pip.list()
 |      
 |      # List all packages
 |      pip.list(user_only=False)
 |      ```
 |  
 |  purge(self) -> None
 |      Purges all user packages.
 |      
 |      This command deletes all of the user packages. Use with care.
 |      
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |      pip.purge()
 |      ```
 |  
 |  uninstall(self, package_or_packages: Union[Sequence[str], str]) -> None
 |      Uninstalls a single package or list of packages. Currently, it makes no distinction between user's packages
 |      and system packages
 |      
 |      Arguments:
 |      :param package_or_packages: A string containing a single package or a list of strings (to uninstall multiple packages).
 |      :raises PipException: When there is an error executing the `pip` command.
 |      
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |      
 |      # Uninstall a single package
 |      pip.uninstall("my-package")
 |      
 |      # Uninstall multiple packages
 |      pip.uninstall([
 |          "package1",
 |          "package2",
 |      ])
 |      ```
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  get_instance(repository_name: Optional[str] = None, domain_name: Optional[str] = None, repository_url_without_scheme: Optional[str] = None, codeartifact_api_endpoint: Optional[str] = None, region: str = 'eu-west-1') -> 'Pip' from builtins.type
 |      Builds a singleton instance of `Pip` using the passed in attributes.
 |      
 |      Arguments:
 |      :param repository_name: The name of the private repository.
 |      :param domain_name: The domain of the private repository (specific to CodeArtifact).
 |      :param repository_url_without_scheme: The URL to the repository including the path but without the scheme (i.e. without http:// or https://).
 |      :param codeartifact_api_endpoint: The endpoint to talk to the CodeArtifact API.
 |      :param region: The AWS region. Defaults to "eu-west-1".
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  REFRESH_CREDENTIALS_BEFORE_EXPIRATION_MINUTES = 60
 |  
 |  __annotations__ = {'REFRESH_CREDENTIALS_BEFORE_EXPIRATION_MINUTES': ty...

Troubleshooting

Installing packages carries with it some risk of installing or upgrading dependent packages that are incompatible with one another. If your environment becomes unstable as a result, we recommend the steps below.

Purge your environment

When you purge your environment, you remove all of the packages that you installed, giving your environment a fresh start. Once you do this, restart the affected kernels and you should be back on track. Learn more about how to Purge packages.

Restart your environment

If purging your environment doesn’t solve the problem, it’s likely there's an error in one of the system libraries. Because all system-level changes are ephemeral, we recommend that you restart the environment.

  1. From the File menu, select Hub Control Panel.
  2. Click Stop My Server (you might have to click it multiple times).
  3. Once the server has stopped, start it again.

Download additional data to enhance Natural Language Toolkit

The Python package Natural Language Toolkit (NLTK) is included in the Jupyter environment by default. NLTK offers a comprehensive collection of datasets and resources, such as corpora (for example, the Brown Corpus), grammars, and tokenizers, to support various natural language processing tasks. For the list of available corpora, see the NLTK data documentation.

The following example shows how to download a single corpus (brown) and how to download all of the additional data at once:

from fbri.package_managers.nltk import Downloader
downloader = Downloader()
# To download the corpus 'brown'
downloader.download('brown')

# To download all the corpora at once
downloader.download('all')