You can install Python packages from the Python Package Index (PyPI) into your Jupyter environment using the Pip package manager class. The Pip class wraps the `pip` command with additional configuration so that it works properly in our environment. It adds authentication and endpoint configuration so that the requested packages can be retrieved without violating security protocols. It also automatically resolves dependencies, installing any packages required by the packages you requested.
You can use the Downloader package manager class to download additional data for the Natural Language Toolkit (NLTK) library that is pre-installed in your Jupyter environment.
The following sections describe how to install and manage Python packages in your Jupyter environment using the Pip package manager class.
To begin, get an instance of our Pip class:
from fbri.package_managers import Pip
pip = Pip.get_instance()
Install a package (or a list of packages) by calling the `pip.install` function and passing in either a string with the package name (for a single package) or a list of strings (for multiple packages).
Some newly installed packages don’t work properly until you restart the kernel.
We recommend that you create a notebook exclusively for installing packages and that you don’t start other notebooks until you're finished installing packages.
When you’re ready, restart the kernel: go to the Kernel menu and click Restart Kernel.
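One reason a restart can be needed is that a running kernel keeps using any module it has already imported. The following is a minimal sketch, using only the Python standard library, that reports which modules are already loaded in the current kernel. `already_imported` is a hypothetical helper name, and note that a package's import name can differ from its install name.

```python
import sys


def already_imported(module_names):
    """Return the names from module_names that this interpreter has already imported."""
    return [name for name in module_names if name in sys.modules]


# 'sys' is always loaded; the second name is a made-up module used for illustration.
print(already_imported(["sys", "hypothetical_module_xyz"]))  # → ['sys']
```

If a package you just installed or upgraded appears in the result, restart the kernel before relying on the new version.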
The following example shows how to install a single package:
pip.install("torch") # installing a single package
The system response should look like this:
Writing to /home/jovyan/.config/pip/pip.conf
attempting to install torch...
Executing `python -m pip install --user torch`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting torch
Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/torch/1.10.2/torch-1.10.2-cp39-cp39-manylinux1_x86_64.whl (881.9 MB)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.9/site-packages (from torch) (4.1.1)
Installing collected packages: torch
Successfully installed torch-1.10.2
All done
WARNING: The scripts convert-caffe2-to-onnx, convert-onnx-to-caffe2 and torchrun are installed in '/home/jovyan/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
The following example shows how to install multiple packages:
# installing multiple packages
pip.install([
    "sly",
    "ply",
])
The system response should look like this:
Writing to /home/jovyan/.config/pip/pip.conf
attempting to install sly...
Executing `python -m pip install --user sly`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting sly
Using cached sly-0.4-py3-none-any.whl
Installing collected packages: sly
Successfully installed sly-0.4
attempting to install ply...
Executing `python -m pip install --user ply`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting ply
Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/ply/3.11/ply-3.11-py2.py3-none-any.whl (49 kB)
Installing collected packages: ply
Successfully installed ply-3.11
All done
You can install a specific version of a package by including one or more version specifiers in the string that you pass to `pip.install`, as shown in the following example:
pip.install("pyathena==2.4.0") # install pyathena version 2.4.0
The system response should look like this:
Writing to /home/jovyan/.config/pip/pip.conf
attempting to install pyathena==2.4.0...
Executing `python -m pip install --user pyathena==2.4.0`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Collecting pyathena==2.4.0
Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/pyathena/2.4.0/PyAthena-2.4.0-py3-none-any.whl (38 kB)
Requirement already satisfied: boto3>=1.4.4 in /opt/conda/lib/python3.9/site-packages (from pyathena==2.4.0) (1.20.24)
Requirement already satisfied: botocore>=1.5.52 in /opt/conda/lib/python3.9/site-packages (from pyathena==2.4.0) (1.23.24)
Collecting tenacity>=4.1.0
Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/tenacity/8.0.1/tenacity-8.0.1-py3-none-any.whl (24 kB)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena==2.4.0) (0.5.2)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena==2.4.0) (0.10.0)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena==2.4.0) (2.8.2)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena==2.4.0) (1.26.8)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore>=1.5.52->pyathena==2.4.0) (1.16.0)
Installing collected packages: tenacity, pyathena
Successfully installed pyathena-2.4.0 tenacity-8.0.1
All done
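To confirm which version ended up installed, you can query the package metadata from Python. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8 and later). `installed_version` is a hypothetical helper name, not part of the Pip class.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string for a distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# After the install above, this would be expected to report the pinned version.
print(installed_version("pyathena"))
```

A result of `None` means that Python cannot find the package in the current environment.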
You can upgrade an already-installed package to the latest version available in PyPI by passing the `upgrade=True` argument to the `install` method. The effect is the same as running the `pip install --upgrade package` command.
Only upgrade packages that you have installed.
Attempting to upgrade a system package can break your environment. See Troubleshooting for recovery suggestions, if needed.
pip.install("pyathena", upgrade=True) # upgrade pyathena
The system response should look like this:
Writing to /home/jovyan/.config/pip/pip.conf
attempting to install pyathena...
Executing `python -m pip install --upgrade --user pyathena`
Looking in indexes: https://aws:****@main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/
Requirement already satisfied: pyathena in ./.local/lib/python3.9/site-packages (2.4.0)
Collecting pyathena
Using cached https://main-1234567890.d.codeartifact.eu-west-1.amazonaws.com/pypi/prod_fortapis_python/simple/pyathena/2.4.1/PyAthena-2.4.1-py3-none-any.whl (39 kB)
Requirement already satisfied: botocore>=1.5.52 in /opt/conda/lib/python3.9/site-packages (from pyathena) (1.23.24)
Requirement already satisfied: boto3>=1.4.4 in /opt/conda/lib/python3.9/site-packages (from pyathena) (1.20.24)
Requirement already satisfied: tenacity>=4.1.0 in ./.local/lib/python3.9/site-packages (from pyathena) (8.0.1)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena) (0.5.2)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /opt/conda/lib/python3.9/site-packages (from boto3>=1.4.4->pyathena) (0.10.0)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena) (1.26.8)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.9/site-packages (from botocore>=1.5.52->pyathena) (2.8.2)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore>=1.5.52->pyathena) (1.16.0)
Installing collected packages: pyathena
Attempting uninstall: pyathena
Found existing installation: pyathena 2.4.0
Uninstalling pyathena-2.4.0:
Successfully uninstalled pyathena-2.4.0
Successfully installed pyathena-2.4.1
All done
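The `Executing` lines in the responses above show the command that runs for each package. The following is a minimal sketch of how such commands could be assembled, inferred only from those log lines. `build_install_commands` is a hypothetical helper, not part of the `fbri` package, and the real class also handles authentication and configuration that this sketch leaves out.

```python
def build_install_commands(package_or_packages, upgrade=False):
    """Build one pip command (as an argument list) per requested package.

    Illustrative only: mirrors the `Executing ...` log lines, not the real implementation.
    """
    # Accept either a single string or a sequence of strings.
    if isinstance(package_or_packages, str):
        packages = [package_or_packages]
    else:
        packages = list(package_or_packages)

    commands = []
    for package in packages:
        command = ["python", "-m", "pip", "install"]
        if upgrade:
            command.append("--upgrade")
        # --user installs into the user's directory rather than the system location.
        command += ["--user", package]
        commands.append(command)
    return commands


for command in build_install_commands(["sly", "ply"]):
    print(" ".join(command))
```

Building one command per package matches the logs, where each package in a list gets its own `attempting to install` and `Executing` lines.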
Uninstall packages by calling the `pip.uninstall` function. As with the `pip.install` function, you can pass in a single string for a single package or a list of strings for multiple packages.
Use the `pip.list` function to see the packages you have installed (see List packages).
Only uninstall packages that you have installed.
Attempting to uninstall system-level packages can lead to unexpected errors. See Troubleshooting for recovery suggestions, if needed.
The following example shows how to uninstall a single package:
pip.uninstall("ply") # uninstall a single package
The system response should look like this:
attempting to install ply...
Executing `python -m pip uninstall --yes ply`
Found existing installation: ply 3.11
Uninstalling ply-3.11:
Successfully uninstalled ply-3.11
All done
The following example shows how to uninstall multiple packages:
# uninstall multiple packages
pip.uninstall([
    "pyathena",
    "sly",
])
The system response should look like this:
attempting to install pyathena...
Executing `python -m pip uninstall --yes pyathena`
Found existing installation: pyathena 2.4.1
Uninstalling pyathena-2.4.1:
Successfully uninstalled pyathena-2.4.1
attempting to install sly...
Executing `python -m pip uninstall --yes sly`
Found existing installation: sly 0.4
Uninstalling sly-0.4:
Successfully uninstalled sly-0.4
All done
Use the `pip.list` function to list the packages that you installed (the default) or all installed packages.
The following example shows the list of packages that you installed:
pip.list() # lists only user packages
The system response should look like this:
Executing `python -m pip list --user`
Package  Version
-------- -------
tenacity 8.0.1
torch    1.10.2
In a previous example, we uninstalled `pyathena`, but that did not automatically uninstall its dependency (`tenacity`). That's why `tenacity` is still listed.
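To find dependencies like this before they are left behind, you can read a package's declared requirements while it is still installed. This is a minimal sketch using the standard library's `importlib.metadata`. `declared_requirements` is a hypothetical helper name, and the returned strings may include version specifiers and optional extras.

```python
from importlib import metadata


def declared_requirements(distribution):
    """Return the requirement strings a distribution declares, or [] if none or not installed."""
    try:
        # metadata.requires returns None when the distribution declares no requirements.
        return metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []


# Run this before uninstalling a package to note what it pulled in.
for requirement in declared_requirements("pyathena"):
    print(requirement)
```

Only uninstall a listed dependency if you installed it yourself and no other package needs it.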
The following example passes the `user_only=False` argument to list all installed packages:
pip.list(user_only=False) # very long output not shown
The `pip.purge` function allows you to start over with no extra packages installed. The following example removes every package that you installed:
pip.purge()
The system response should look like this:
Executing `rm -rf ~/.local/lib/python*`
The `pip.list` function should show no results after using the `pip.purge` function:
pip.list() # displays nothing because all packages have been removed
Executing `python -m pip list --user`
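Because the purge response shows `rm -rf ~/.local/lib/python*`, you may want to see which directories that pattern matches before purging. The following is a minimal, read-only sketch using the standard library. `user_package_dirs` is a hypothetical helper name, and the optional `home` parameter exists only to make the function easy to try against another directory.

```python
import glob
import os


def user_package_dirs(home=None):
    """List the directories matching <home>/.local/lib/python* without deleting anything."""
    base = home if home is not None else os.path.expanduser("~")
    pattern = os.path.join(base, ".local", "lib", "python*")
    return sorted(glob.glob(pattern))


# Prints the directories that the purge pattern would match for the current user.
print(user_package_dirs())
```

An empty list means that there are no user-installed package directories to remove.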
Use the `help` function to see the documentation for the `Pip` instance and its methods:
help(pip)
The system response should look like this:
Help on Pip in module fbri.package_managers.pip object:

class Pip(builtins.object)
 |  Pip(repository_name: str, domain_name: str, repository_url_without_scheme: str, codeartifact_api_endpoint: str, region: str, proxy_service_facade: fbri.common.proxy_service_facade.ProxyServiceFacade) -> None
 |
 |  Wrapper around the `pip` command to handle authentication and configuration to install packages from our private repository.
 |
 |  Usage:
 |  ```
 |  pip = Pip.get_instance()
 |
 |  # Install a single package
 |  pip.install("my-package")
 |
 |  # Install multiple packages
 |  pip.install([
 |      "package1",
 |      "package2==2.1.0", # specifying a version
 |  ])
 |
 |  # Upgrade a package
 |  pip.install("package-to-upgrade", upgrade=True)
 |
 |  # List user packages
 |  pip.list()
 |
 |  # List all packages
 |  pip.list(user_only=False)
 |
 |  # Uninstall a single package
 |  pip.uninstall("my-package")
 |
 |  # Uninstall multiple packages
 |  pip.uninstall([
 |      "package1",
 |      "package2",
 |  ])
 |
 |  # Purge all user packages
 |  pip.purge()
 |  ```
 |
 |  Methods defined here:
 |
 |  __init__(self, repository_name: str, domain_name: str, repository_url_without_scheme: str, codeartifact_api_endpoint: str, region: str, proxy_service_facade: fbri.common.proxy_service_facade.ProxyServiceFacade) -> None
 |      Initialize self. See help(type(self)) for accurate signature.
 |
 |  install(self, package_or_packages: Union[Sequence[str], str], *, upgrade: bool = False) -> None
 |      Installs a single package or a list of packages in the user directory.
 |
 |      This function takes care of authenticating against our private Python repository and configuring `pip`
 |      to talk to it. It also makes sure that the package installation goes to the user's directory so that
 |      the package doesn't get installed in ephemeral storage and is lost when the user's pod dies.
 |
 |      Arguments:
 |      :param package_or_packages: A string containing a single package or a list of strings (to install multiple packages).
 |      :param upgrade: `True` if you want pip to run with the `--upgrade` flage (`False` by default).
 |      :raises PipException: When there is an error executing the `pip` command.
 |
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |
 |      # Install single package
 |      pip.install("my_package")
 |
 |      # Install multiple packages
 |      pip.install([
 |          "package1",
 |          "package2",
 |          ...
 |      ])
 |
 |      # Install/ugrade package
 |      pip.install("my_package", upgrade=True)
 |
 |      # Specify a version
 |      pip.install("my_package==2.2.1")
 |      ```
 |
 |  list(self, *, user_only: bool = True) -> None
 |      Lists all the packages that are installed and outputs them to stdout. By default, it only outputs the user's packages,
 |      but it can also list system packages.
 |
 |      Arguments:
 |      :param user_only: `True` to display only user installed packages, `False` to include system packages. Defaults to `True`.
 |
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |
 |      # List user packages
 |      pip.list()
 |
 |      # List all packages
 |      pip.list(user_only=False)
 |      ```
 |
 |  purge(self) -> None
 |      Purges all user packages.
 |
 |      This command deletes all of the user packages. Use with care.
 |
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |      pip.purge()
 |      ```
 |
 |  uninstall(self, package_or_packages: Union[Sequence[str], str]) -> None
 |      Uninstalls a single package or list of packages. Currently, it makes no distinction between user's packages
 |      and system packages
 |
 |      Arguments:
 |      :param package_or_packages: A string containing a single package or a list of strings (to uninstall multiple packages).
 |      :raises PipException: When there is an error executing the `pip` command.
 |
 |      Example:
 |      ```
 |      pip = Pip.get_intance()
 |
 |      # Uninstall a single package
 |      pip.uninstall("my-package")
 |
 |      # Uninstall multiple packages
 |      pip.uninstall([
 |          "package1",
 |          "package2",
 |      ])
 |      ```
 |
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |
 |  get_instance(repository_name: Optional[str] = None, domain_name: Optional[str] = None, repository_url_without_scheme: Optional[str] = None, codeartifact_api_endpoint: Optional[str] = None, region: str = 'eu-west-1') -> 'Pip' from builtins.type
 |      Builds a singleton instance of `Pip` using the passed in attributes.
 |
 |      Arguments:
 |      :param repository_name: The name of the private repository.
 |      :param domain_name: The domain of the private repository (specific to CodeArtifact).
 |      :param repository_url_without_scheme: The URL to the repository including the path but without the scheme (i.e. without http:// or https://).
 |      :param codeartifact_api_endpoint: The endpoint to talk to the CodeArtifact API.
 |      :param region: The AWS region. Defaults to "eu-west-1".
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  REFRESH_CREDENTIALS_BEFORE_EXPIRATION_MINUTES = 60
 |
 |  __annotations__ = {'REFRESH_CREDENTIALS_BEFORE_EXPIRATION_MINUTES': ty...
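The help text states that `install` and `uninstall` raise `PipException` when the `pip` command fails. The following is a minimal sketch of collecting failures across several packages rather than stopping at the first one. `install_all` is a hypothetical helper, and because the help output does not show the import path for `PipException`, the sketch catches `Exception` broadly as a stand-in.

```python
def install_all(install, packages):
    """Call install(name) for each package and return {name: exception} for the ones that failed.

    `install` is any callable that takes a package string, for example pip.install.
    """
    failures = {}
    for name in packages:
        try:
            install(name)
        except Exception as exc:  # stand-in for PipException, whose import path is not shown
            failures[name] = exc
    return failures


# Demonstration with a fake installer, so the sketch runs without the real Pip class.
def fake_install(name):
    if name == "broken-package":
        raise RuntimeError("simulated pip failure")


print(sorted(install_all(fake_install, ["sly", "broken-package"])))  # → ['broken-package']
```

With the real class, the call would presumably look like `install_all(pip.install, ["sly", "ply"])`.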
Installing packages carries with it some risk of installing or upgrading dependent packages that are incompatible with one another. If your environment becomes unstable as a result, we recommend the steps below.
When you purge your environment, you remove all of the packages that you installed, giving your environment a fresh start. Once you do this, restart the affected kernels and you should be back on track. Learn more about how to Purge packages.
If purging your environment doesn’t solve the problem, it’s likely there's an error in one of the system libraries. Because all system-level changes are ephemeral, we recommend that you restart the environment.
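To check whether incompatible dependencies are the likely cause before you purge, you could ask pip to validate the installed packages. This is a minimal sketch that runs the standard `pip check` command through `subprocess`. It calls pip directly rather than through the Pip class, on the assumption that a local, read-only check is acceptable in your environment. `check_dependencies` is a hypothetical helper name.

```python
import subprocess
import sys


def check_dependencies():
    """Run `pip check` and return (ok, report), where ok is True if no conflicts were reported."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "check"],
        capture_output=True,
        text=True,
    )
    report = (result.stdout + result.stderr).strip()
    return result.returncode == 0, report
```

Calling `ok, report = check_dependencies()` and printing `report` when `ok` is `False` shows which packages have missing or conflicting requirements.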
The Python package Natural Language Toolkit (NLTK) is included in the Jupyter environment by default. NLTK offers a comprehensive collection of datasets and resources such as corpora (like Brown Corpus), grammars, tokenizers, and more to support various natural language processing tasks. The list of available corpora can be found here.
The following example shows how to download a single corpus or all of the additional corpora at once:
from fbri.package_managers.nltk import Downloader
downloader = Downloader()
# To download the corpus 'brown'
downloader.download('brown')
# To download all the corpora at once
downloader.download('all')
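After a download, you can confirm that NLTK can locate a corpus before you use it. This is a minimal sketch built on NLTK's `nltk.data.find`, which raises `LookupError` when a resource is missing. `corpus_available` is a hypothetical helper name, and the function returns `False` if NLTK itself is not importable.

```python
import importlib.util


def corpus_available(name):
    """Return True if NLTK can find the named corpus on its data path."""
    if importlib.util.find_spec("nltk") is None:
        return False  # NLTK is not installed in this interpreter
    import nltk

    try:
        nltk.data.find(f"corpora/{name}")
        return True
    except LookupError:
        return False


print(corpus_available("brown"))
```

If the result is `False` for a corpus you expected, run the download again and then repeat the check.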