{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Download" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After searching, downloading products is one of the most important features of `eodag`. This page describes the different methods available to download products and the parameters that these methods accept." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "Warning\n", "\n", "Downloading products from a provider whose storage is based on AWS may incur some cost.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Results obtained from *PEPS* after a search of *Sentinel 2 Level-1C* products over France in March 2021 will be loaded in a [SearchResult](../../api_reference/searchresult.rst#eodag.api.search_result.SearchResult). But first, the credentials need to be set in order to be able to download anything." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "# os.environ[\"EODAG__PEPS__AUTH__CREDENTIALS__USERNAME\"] = \"PLEASE_CHANGE_ME\"\n", "# os.environ[\"EODAG__PEPS__AUTH__CREDENTIALS__PASSWORD\"] = \"PLEASE_CHANGE_ME\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A workspace directory is created to store the downloaded products." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "workspace = 'eodag_workspace_download'\n", "if not os.path.isdir(workspace):\n", " os.mkdir(workspace)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default `eodag` saves products in the directory set by `outputs_prefix` which is by default the system temporary folder (`/tmp` on Linux) and quicklooks in a `quicklooks `subfolder of `outputs_prefix` (`tmp/quicklooks` on Linux). Here `eodag` is configured to download products in this workspace directory." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "os.environ[\"EODAG__PEPS__DOWNLOAD__OUTPUTS_PREFIX\"] = os.path.abspath(workspace)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another setting that could be defined here is whether or not products need to be automatically extracted from their archive. They are extracted by default, this setting is not going to be altered here. 
The search result is finally loaded with [deserialize_and_register()](../../api_reference/core.rst#eodag.api.core.EODataAccessGateway.deserialize_and_register)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This SearchResult stores 10 S2_MSI_L1C products.\n" ] } ], "source": [ "from eodag import EODataAccessGateway\n", "dag = EODataAccessGateway()\n", "search_results = dag.deserialize_and_register(\"data/download_search_results.geojson\")\n", "print(f\"This SearchResult stores {len(search_results)} {search_results[0].product_type} products.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Logging is set to see more about what `eodag` does when it downloads products." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from eodag import setup_logging\n", "setup_logging(2)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "# # This code cell has a special metadata entry: \"nbsphinx\": \"hidden\"\n", "# # That hides it when the documentation is built with nbsphinx/sphinx.\n", "\n", "# # Uncomment these lines to regenerate the GeoJSON file used in this notebook.\n", "\n", "# from eodag.api.search_result import SearchResult\n", "\n", "# search_results = dag.search_all(\n", "# productType=\"S2_MSI_L1C\",\n", "# start=\"2018-01-01\",\n", "# end=\"2021-01-01\",\n", "# geom={\"lonmin\": 1, \"latmin\": 45, \"lonmax\": 1.5, \"latmax\": 45.5},\n", "# )\n", "# combined_search_results = SearchResult([])\n", "# offline_prods = [p for p in search_results if p.properties[\"storageStatus\"] == \"OFFLINE\"]\n", "# online_prods = [p for p in search_results if p.properties[\"storageStatus\"] == \"ONLINE\"]\n", "# if len(offline_prods) == 0 or len(online_prods) == 0:\n", "# raise ValueError(\"This search result must contain both ONLINE and OFFLINE products for the #notebook to 
be run correctly\")\n", "# combined_search_results.extend(online_prods[:5])\n", "# combined_search_results.extend(offline_prods[:5])\n", "# combined_search_results\n", "# dag.serialize(combined_search_results, \"data/download_search_results.geojson\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Progress bar" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`eodag` displays a progress bar every time it downloads products or quicklooks. It uses `tqdm.auto` to create a progress bar adapted to the context (notebook, terminal).\n", "\n", "### Customize progress bar\n", "\n", "Progress bars can be customized using the `progress_callback` parameter of the download methods. Create your own instance of the [ProgressCallback](../../api_reference/utils.rst#eodag.utils.ProgressCallback) class and customize it, then pass it to the [download](./7_download.ipynb#Download-EO-products) methods:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f1fbb8e667ea4f85a22303514788c234", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Eating carrots: 0%| | 0/3 [00:00" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "import matplotlib.image as mpimg\n", "\n", "fig = plt.figure(figsize=(10, 8))\n", "for i, product in enumerate(search_results, start=1):\n", "    # This line takes care of downloading the quicklook\n", "    quicklook_path = product.get_quicklook()\n", "\n", "    # Plot the quicklook\n", "    img = mpimg.imread(quicklook_path)\n", "    ax = fig.add_subplot(3, 4, i)\n", "    ax.set_title(i - 1)\n", "    plt.imshow(img)\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download EO products" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dynamically configure some download options" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "The 3 download methods introduced below accept the following optional kwargs that temporarily override the global configuration:\n", "\n", "* `outputs_prefix` (`str`): absolute path to a folder where the products should be saved\n", "* `extract` (`bool`): whether to automatically extract or not the downloaded product archive\n", "* `dl_url_params` (`dict`): additional parameters to pass over to the download url as an url parameter\n", "* `delete_archive` (`bool`): whether to delete the downloaded archives" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Order OFFLINE products" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As introduced in the [getting started guide](../../getting_started_guide/product_storage_status.rst) an EO product may not be available for download immediately. If the product status is `OFFLINE`, the download methods will request an order of the product and, by default, retry to download it every 2 minutes during 20 minutes. These two durations can be set with the `wait` (in minutes) and `timeout` (in minutes) optional parameters of all the download methods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The storage status of a product can be obtained from its `storageStatus` field. The status of an `OFFLINE` product is updated by `eodag` to `STAGING` when ordered and to `ONLINE` when found available." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['ONLINE',\n", " 'ONLINE',\n", " 'ONLINE',\n", " 'ONLINE',\n", " 'ONLINE',\n", " 'OFFLINE',\n", " 'OFFLINE',\n", " 'OFFLINE',\n", " 'OFFLINE',\n", " 'OFFLINE']" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[p.properties[\"storageStatus\"] for p in search_results]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A `FilterProperty` can be used to filter out `OFFLINE` products to avoid triggering any product order. " ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2021-05-18 15:53:55,220-15s eodag.plugins.crunch.filter_property [INFO ] Finished filtering products. 5 resulting products\n" ] }, { "data": { "text/plain": [ "['ONLINE', 'ONLINE', 'ONLINE', 'ONLINE', 'ONLINE']" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "online_search_results = search_results.filter_property(\n", " storageStatus=\"ONLINE\"\n", ")\n", "[p.properties[\"storageStatus\"] for p in online_search_results]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download multiple products at once" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[EODataAccessGateway](../../api_reference/core.rst#eodag.api.core.EODataAccessGateway) offers a [download_all()](../../api_reference/core.rst#eodag.api.core.EODataAccessGateway.download_all) method that takes a [SearchResult](../../api_reference/searchresult.rst#eodag.api.search_result.SearchResult) argument and will try to download each [EOProduct](../../api_reference/eoproduct.rst#eodag.api.product._product.EOProduct) it contains. It returns a list of absolute paths to the downloaded products. For the purpose of this user guide only 2 products will be downloaded." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2021-05-18 15:53:57,558-15s eodag.core [INFO ] Downloading 2 products\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2a46ccd2f87649e6899a26d07aedcad8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloaded products: 0%| | 0/2 [00:00