From a32621b60a65e3a9a5ccee34fb5ddac8c6bdbe8b Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Wed, 28 Feb 2024 15:20:16 -0500 Subject: [PATCH 01/25] added prerequisites to each notebook --- tutorials/notebooks/.DS_Store | Bin 0 -> 6148 bytes tutorials/notebooks/GenAI/.DS_Store | Bin 0 -> 6148 bytes ...AzureAIStudio_index_structured_notebook.ipynb | 12 +++++++++--- ...eAIStudio_index_structured_with_console.ipynb | 12 +++++++++--- .../notebooks/AzureAIStudio_langchain.ipynb | 12 +++++++++--- .../GenAI/notebooks/AzureOpenAI_embeddings.ipynb | 12 +++++++++--- .../GenAI/notebooks/Azure_Pubmed_chatbot.ipynb | 10 +++++++++- 7 files changed, 45 insertions(+), 13 deletions(-) create mode 100644 tutorials/notebooks/.DS_Store create mode 100644 tutorials/notebooks/GenAI/.DS_Store diff --git a/tutorials/notebooks/.DS_Store b/tutorials/notebooks/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..7223fb49561963c4493ca544d6d90296fe38de66 GIT binary patch literal 6148 zcmeHKQESvd5T3nWZ9*yDgJK^8eJ%9bYHjf)UMo`Y$%sB^Ig=*TV46LWTn~C2%y$?VK#_uH_WAtE&xr@KTwBJ!Y&9UskKg!@?+q~knYpfY1* zG^E#biZ)t^7RNF&z!#oTr8+C!)kkZd=dF8Z(KkJbnz$L)gSh(4PaiN$^2d&ok9js3_MRN;vWfFD zt4u-+Q{X z^XmQZdvRIn8_T%jnX_zIn72P*-CEIARF#D;FW}{jXZVOmRA9Zl;&-6p3RVSgMi+Ex zybZmMVYr_{J?~h z3N1=23JKp1#X45;o&bTYt_+}*mcIKFEG=vOEU$F&yc kDKN}cj99*kH=#!057+>P4qJ=xK;%cj&>)R4@J|`|4fZr%G5`Po literal 0 HcmV?d00001 diff --git a/tutorials/notebooks/GenAI/.DS_Store b/tutorials/notebooks/GenAI/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..8570bb25eabde59ce796310942ccf8ce07c6642b GIT binary patch literal 6148 zcmeHK%}T>S5Z-O8CKMqDg&qT53)a7C@e*Qv0V8@)sR;=h8ndNI&7l->))(?gd>&_Z zH)3f8Pa<|E%zm@;lV!gRyIIBz9p!Ll=n)G>tp&Q50V8=abC! 
z$2Zu!6e0;K+Yheds2EsVXClr0IE}`#APyr)xxJ3lP!z71q+up&Jryua(;Qf>>GZhU zY1>C9y;<9y_V&R(Jm}75rm?-VdwMY#CQqq&QFLKmi17v@3fOp;K(sYF z8cT%`0pY3?P?d6h#o($O{IG?ofgIpcCl+G}ql4d;>5MxHsU-%8fq4e% zs%zo-fByabf4+!%!~ikyuNdHszSnnQO8RVFm>iz9Ht0Pl3g(pxKTE)nM=|8$QQQKR a0)86}Ku2S#5Ii9CBOqy@h8Xx&20j2xmQADp literal 0 HcmV?d00001 diff --git a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb b/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb index 3b097d6..8ba1e89 100644 --- a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb +++ b/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb @@ -13,9 +13,15 @@ "metadata": {}, "source": [ "## Overview\n", - "LLMs work best when querying vector databases (DBs). In a few of our tutorials in this repo, we have created vector DBs from unstructured data like PDF documents. Here, we create a vector DB from structured data, which is technically complex and requires additional steps. Here we will vectorize (embed) a csv file, index our DB using Azure AI Search, and then query our vector DB using a GPT model deployed within Azure AI Studio.\n", - "\n", - "Note that we assume you have already deployed a model to your AI Studio Environment and have access to your keys and other variables. " + "LLMs work best when querying vector databases (DBs). In a few of our tutorials in this repo, we have created vector DBs from unstructured data like PDF documents. Here, we create a vector DB from structured data, which is technically complex and requires additional steps. Here we will vectorize (embed) a csv file, index our DB using Azure AI Search, and then query our vector DB using a GPT model deployed within Azure AI Studio." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have access to Azure AI Studio and have already deployed an LLM." 
] }, { diff --git a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb b/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb index 77491d2..a13ae1f 100644 --- a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb +++ b/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb @@ -20,9 +20,15 @@ "## Overview\n", "LLMs work best when querying vector databases (DBs). In a few of our tutorials in this repo, we have created vector DBs from unstructured data like PDF documents. Here, we create a vector DB from structured data, which is technically complex and requires additional steps. Here we will vectorize (embed) a csv file, index our DB using Azure AI Search, and then query our vector DB using a GPT model deployed within Azure AI Studio.\n", "\n", - "This notebook differs slightly from the tutorial titled `AzureAIStudio_index_structured_notebook.ipynb` in that here we create the index within Azure AI Search directly, rather than in the notebook. We also use NIH grant data here rather than a Kaggle dataset. \n", - "\n", - "Note that we assume you have already deployed a model to your AI Studio Environment and have access to your keys and other variables. We also assume you have an Azure Search Service and can upload your csv data to create the index through the console." + "This notebook differs slightly from the tutorial titled `AzureAIStudio_index_structured_notebook.ipynb` in that here we create the index within Azure AI Search directly, rather than in the notebook. We also use NIH grant data here rather than a Kaggle dataset. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have access to Azure AI Studio and Azure AI Search Service, and that you have already deployed an LLM." 
] }, { diff --git a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb b/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb index 40b3ef2..04d8ba7 100644 --- a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb +++ b/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb @@ -25,9 +25,15 @@ "source": [ "## Overview\n", "Models you deploy to your Azure AI Studio can be accessed via API calls. [Langchain](https://python.langchain.com/docs/get_started/introduction) is a development framework for applications power by language models. \n", - "This tutorial gives you the basics of using langchain to work with Large Language Models (LLMs) for document summarization and basic chat bot functionality. You could take what we have here to build a front end application using something like streamlit, or other further iterations.\n", - "\n", - "We assume you have already deployed a model to your AI Studio Environment and have access to your keys and other variables. " + "This tutorial gives you the basics of using langchain to work with Large Language Models (LLMs) for document summarization and basic chat bot functionality. You could take what we have here to build a front end application using something like streamlit, or other further iterations." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have access to Azure AI Studio and have already deployed an LLM." ] }, { diff --git a/tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb b/tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb index cd22cae..2b6885e 100644 --- a/tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb +++ b/tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb @@ -18,9 +18,15 @@ "metadata": {}, "source": [ "## Overview\n", - "Models you deploy to Azure OpenAI can be accessed via API calls. 
This tutorial gives you the basics of creating local embeddings from custom data and querying over those.\n", - "\n", - "We assume you have already deployed a model to your Azure OpenAI Environment and have access to your keys and other variables. " + "Models you deploy to Azure OpenAI can be accessed via API calls. This tutorial gives you the basics of creating local embeddings from custom data and querying over those." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have access to Azure AI Studio and have already deployed an LLM." ] }, { diff --git a/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index bed2f8e..be462cd 100644 --- a/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -18,7 +18,15 @@ "metadata": {}, "source": [ "## Overview\n", - "[PubMed](https://pubmed.ncbi.nlm.nih.gov/about/) supports the search and retrieval of biomedical and life sciences literature with the aim of improving health–both globally and personally. Here we create a chatbot that is grounded on PubMed data. Most Azure command line tools are already installed and it is recommended to use the **AzureML** kernel in your Jupyter notebook. Here we assume you have already deployed an LLM within Azure AI Studio." + "[PubMed](https://pubmed.ncbi.nlm.nih.gov/about/) supports the search and retrieval of biomedical and life sciences literature with the aim of improving health–both globally and personally. Here we create a chatbot that is grounded on PubMed data. Most Azure command line tools are already installed and it is recommended to use the **AzureML** kernel in your Jupyter notebook." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have access to Azure AI Studio and have already deployed an LLM." 
] }, { From a8458b358b42deeff0b70dfbc7510e84d3557587 Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Wed, 28 Feb 2024 15:23:48 -0500 Subject: [PATCH 02/25] updated notebooks --- tutorials/notebooks/.DS_Store | Bin 6148 -> 0 bytes tutorials/notebooks/GenAI/.DS_Store | Bin 6148 -> 0 bytes 2 files changed, 0 insertions(+), 0 deletions(-) delete mode 100644 tutorials/notebooks/.DS_Store delete mode 100644 tutorials/notebooks/GenAI/.DS_Store diff --git a/tutorials/notebooks/.DS_Store b/tutorials/notebooks/.DS_Store deleted file mode 100644 index 7223fb49561963c4493ca544d6d90296fe38de66..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 6148 zcmeHKQESvd5T3nWZ9*yDgJK^8eJ%9bYHjf)UMo`Y$%sB^Ig=*TV46LWTn~C2%y$?VK#_uH_WAtE&xr@KTwBJ!Y&9UskKg!@?+q~knYpfY1* zG^E#biZ)t^7RNF&z!#oTr8+C!)kkZd=dF8Z(KkJbnz$L)gSh(4PaiN$^2d&ok9js3_MRN;vWfFD zt4u-+Q{X z^XmQZdvRIn8_T%jnX_zIn72P*-CEIARF#D;FW}{jXZVOmRA9Zl;&-6p3RVSgMi+Ex zybZmMVYr_{J?~h z3N1=23JKp1#X45;o&bTYt_+}*mcIKFEG=vOEU$F&yc kDKN}cj99*kH=#!057+>P4qJ=xK;%cj&>)R4@J|`|4fZr%G5`Po diff --git a/tutorials/notebooks/GenAI/.DS_Store b/tutorials/notebooks/GenAI/.DS_Store deleted file mode 100644 index 8570bb25eabde59ce796310942ccf8ce07c6642b..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 6148 zcmeHK%}T>S5Z-O8CKMqDg&qT53)a7C@e*Qv0V8@)sR;=h8ndNI&7l->))(?gd>&_Z zH)3f8Pa<|E%zm@;lV!gRyIIBz9p!Ll=n)G>tp&Q50V8=abC! 
z$2Zu!6e0;K+Yheds2EsVXClr0IE}`#APyr)xxJ3lP!z71q+up&Jryua(;Qf>>GZhU zY1>C9y;<9y_V&R(Jm}75rm?-VdwMY#CQqq&QFLKmi17v@3fOp;K(sYF z8cT%`0pY3?P?d6h#o($O{IG?ofgIpcCl+G}ql4d;>5MxHsU-%8fq4e% zs%zo-fByabf4+!%!~ikyuNdHszSnnQO8RVFm>iz9Ht0Pl3g(pxKTE)nM=|8$QQQKR a0)86}Ku2S#5Ii9CBOqy@h8Xx&20j2xmQADp From 034f07311df73cd02af7a73d1a756efe89844a8a Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Thu, 29 Feb 2024 13:22:22 -0500 Subject: [PATCH 03/25] reformatted to put tutorials up frong --- README.md | 177 ++++++++---------- .../CycleCloud_CustomRole.json | 0 .../GWAS/GWAS_coat_color.ipynb | 0 .../GenAI/Azure_AI_Studio_README.md | 0 .../GenAI/Azure_Open_AI_README.md | 0 .../notebooks => notebooks}/GenAI/LICENSE | 0 .../GenAI/embedding_demos/acs_embeddings.py | 0 .../GenAI/embedding_demos/aoai_embeddings.py | 0 ...ample_azureaisearch_openaichat_zeroshot.py | 0 .../example_langchain_openaichat_zeroshot.py | 0 .../example_scripts/workshop_embedding.py | 0 .../GenAI/example_scripts/workshop_search.py | 0 .../GenAI/microsoft-earnings.csv | 0 ...reAIStudio_index_structured_notebook.ipynb | 0 ...Studio_index_structured_with_console.ipynb | 0 .../notebooks/AzureAIStudio_langchain.ipynb | 0 .../notebooks/AzureOpenAI_embeddings.ipynb | 0 .../notebooks/Azure_Pubmed_chatbot.ipynb | 0 .../GenAI/requirements.txt | 0 .../Hurricane_Irene_(2005).pdf | Bin .../search_documents/Koutros_et_al_2023.pdf | Bin .../New_York_State_Route_373.pdf | Bin .../GenAI/search_documents/Rai_et_al_2023.pdf | Bin .../search_documents/Silverman_et_al_2023.pdf | Bin .../aoai_workshop_content.pdf | Bin .../search_documents/grant_data_sub1.txt | 0 .../search_documents/grant_data_sub2.txt | 0 .../SRADownload/SRA-Download.ipynb | 0 .../SpleenLiverSegmentation/README.md | 0 .../SpleenSeg_Pretrained-4_27.ipynb | 0 .../Spleen_best_metric_model_pretrained.pth | Bin .../pangolin/pangolin.yaml | 0 .../pangolin/pangolin_pipeline.ipynb | 0 .../rnaseq-myco-tutorial-main/LICENSE | 0 .../rnaseq-myco-tutorial-main/README.md | 0 
.../RNAseq_pipeline.ipynb | 0 .../images/count-workflow.png | Bin .../images/rnaseq-workflow.png | Bin .../images/table-cushman.png | Bin tutorials/README.md | 85 --------- 40 files changed, 81 insertions(+), 181 deletions(-) rename {tutorials/CycleCloud => envs}/CycleCloud_CustomRole.json (100%) rename {tutorials/notebooks => notebooks}/GWAS/GWAS_coat_color.ipynb (100%) rename {tutorials/notebooks => notebooks}/GenAI/Azure_AI_Studio_README.md (100%) rename {tutorials/notebooks => notebooks}/GenAI/Azure_Open_AI_README.md (100%) rename {tutorials/notebooks => notebooks}/GenAI/LICENSE (100%) rename {tutorials/notebooks => notebooks}/GenAI/embedding_demos/acs_embeddings.py (100%) rename {tutorials/notebooks => notebooks}/GenAI/embedding_demos/aoai_embeddings.py (100%) rename {tutorials/notebooks => notebooks}/GenAI/example_scripts/example_azureaisearch_openaichat_zeroshot.py (100%) rename {tutorials/notebooks => notebooks}/GenAI/example_scripts/example_langchain_openaichat_zeroshot.py (100%) rename {tutorials/notebooks => notebooks}/GenAI/example_scripts/workshop_embedding.py (100%) rename {tutorials/notebooks => notebooks}/GenAI/example_scripts/workshop_search.py (100%) rename {tutorials/notebooks => notebooks}/GenAI/microsoft-earnings.csv (100%) rename {tutorials/notebooks => notebooks}/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb (100%) rename {tutorials/notebooks => notebooks}/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb (100%) rename {tutorials/notebooks => notebooks}/GenAI/notebooks/AzureAIStudio_langchain.ipynb (100%) rename {tutorials/notebooks => notebooks}/GenAI/notebooks/AzureOpenAI_embeddings.ipynb (100%) rename {tutorials/notebooks => notebooks}/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb (100%) rename {tutorials/notebooks => notebooks}/GenAI/requirements.txt (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/Hurricane_Irene_(2005).pdf (100%) rename {tutorials/notebooks => 
notebooks}/GenAI/search_documents/Koutros_et_al_2023.pdf (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/New_York_State_Route_373.pdf (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/Rai_et_al_2023.pdf (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/Silverman_et_al_2023.pdf (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/aoai_workshop_content.pdf (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/grant_data_sub1.txt (100%) rename {tutorials/notebooks => notebooks}/GenAI/search_documents/grant_data_sub2.txt (100%) rename {tutorials/notebooks => notebooks}/SRADownload/SRA-Download.ipynb (100%) rename {tutorials/notebooks => notebooks}/SpleenLiverSegmentation/README.md (100%) rename {tutorials/notebooks => notebooks}/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb (100%) rename {tutorials/notebooks => notebooks}/SpleenLiverSegmentation/monai_data/Spleen_best_metric_model_pretrained.pth (100%) rename {tutorials/notebooks => notebooks}/pangolin/pangolin.yaml (100%) rename {tutorials/notebooks => notebooks}/pangolin/pangolin_pipeline.ipynb (100%) rename {tutorials/notebooks => notebooks}/rnaseq-myco-tutorial-main/LICENSE (100%) rename {tutorials/notebooks => notebooks}/rnaseq-myco-tutorial-main/README.md (100%) rename {tutorials/notebooks => notebooks}/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb (100%) rename {tutorials/notebooks => notebooks}/rnaseq-myco-tutorial-main/images/count-workflow.png (100%) rename {tutorials/notebooks => notebooks}/rnaseq-myco-tutorial-main/images/rnaseq-workflow.png (100%) rename {tutorials/notebooks => notebooks}/rnaseq-myco-tutorial-main/images/table-cushman.png (100%) delete mode 100644 tutorials/README.md diff --git a/README.md b/README.md index 9f63a79..3a44017 100644 --- a/README.md +++ b/README.md @@ -1,100 +1,85 @@ +# Microsoft Azure Tutorial Resources - ->This repository falls under the NIH STRIDES 
Initiative. STRIDES aims to harness the power of the cloud to accelerate biomedical discoveries. To learn more, visit https://cloud.nih.gov. - -# NIH Cloud Lab for Azure --------------------------------- -NIH Cloud Lab’s goal is to make Cloud easy and accessible for you, so that you can spend less time on administrative tasks and focus more on research. - -Use this repository to learn about how to use Azure by exploring the linked resources and walking through the tutorials. If you are a beginner, we suggest you begin with this jumpstart section. If you already have foundational knowledge of Azure and Cloud, feel free to skip ahead to the [tutorials](/tutorials/) section for in-depth examples of how to run specific workflows such as genomic variant calling and medical image analysis. - ## Overview of Page Contents -+ [Getting Started](#gs) -+ [Overview](#ov) -+ [Resource Groups](#rg) -+ [Command Line Tools](#cli) -+ [Azure Marketplace](#mark) -+ [Ingest and Store Data](#sto) -+ [Virtual Machines](#vm) -+ [Azure Functions](#vm) -+ [Disk Images](#disk) -+ [Azure Machine Learning](#sag) -+ [Clusters](#clu) -+ [Creating a Conda Environment](#co) -+ [Azure Container Registry](#con) -+ [GitHub](#gh) -+ [Billing and Benchmarking](#bb) -+ [Cost Optimization](#cost) -+ [Getting Support](#sup) -+ [Additional Training](#tr) - -## **Getting Started** -You can learn a lot of what is possible on Azure in the Azure Getting Started [Tutorials Page](https://azure.microsoft.com/en-us/get-started/) and we recommend you go there and explore some of the tutorials on offer. Nonetheless, it can be hard to know where to start if you are new to the cloud. To help you, we thought through some of the most common tasks you will encounter doing cloud-enabled research, and gathered tutorials and guides specific to those topics. We hope the following materials are helpful as you explore using Azure! 
- -## **Overview** -There are three primary ways you can run analyses using Azure: using **Virtual Machines**, **Jupyter Notebook instances**, and **Managed services**. We give a brief overview of each of these here and go into more detail in the sections below. [Virtual Machines](https://azure.microsoft.com/en-us/products/virtual-machines/) are like desktop computers, but you access them through the cloud console and you get to pick the operating system and the specifications such as CPU and memory. In Azure, these virtual machines are called VMs for short. Jupyter Notebook instances are virtual machines with a preconfigured Jupyter Lab. On Azure these are run through [Azure Machine Learning](https://azure.microsoft.com/en-us/products/machine-learning/#product-overview), which is also Azure's ML/AI platform. You decide what kind of virtual machine you want to 'spin up' and then you can run Juptyer notebooks on those virtual machines. Finally, Serverless services are services that allow you to run things, an analysis, an app, a website, and not have to deal with your own servers (VMs). There are still servers running somewhere, you just don't have to manage them. All you have to do is call a command that runs your analysis in the background, and copies the output files to a storage account. [Azure Batch](https://learn.microsoft.com/en-us/azure/batch/batch-technical-overview) is a common example. - -## **Resource Groups** -A resource group is a container that holds related resources for an Azure solution. The resource group can include all the resources for the solution, or only those resources that you want to manage as a group. You decide how you want to allocate resources to resource groups based on what makes the most sense for your use case. Generally, add resources that share the same lifecycle to the same resource group so you can easily deploy, update, and delete them as a group. Each resource group stores metadata about the underlying resources. 
Therefore, when you specify a location for the resource group, you are specifying where that metadata is stored. For compliance reasons, you may need to ensure that your data is stored in a particular region. - -To see more information on how to manage resource groups, visit our docs about [Managing Resource Groups](/docs/resource_groups.md). - -## **Command Line Tools** -Most tasks in Azure can be done without the command line, but the command line tools will generally make your life easier in the long run. Command line interface (CLI) tools are those that you use directly in a terminal/shell as opposed to clicking within the Azure portal's graphical user interface (GUI). The primary tool you will need is the Azure CLI, which will allow you to interact with Virtual Machines (VMs) or Storage Accounts (see below) from your local terminal. Instructions for the CLI can be found [here](https://learn.microsoft.com/en-us/cli/azure/). If you are unable to install locally, you can use all the CLI commands from within VM and Machine Learning instances, or from the [Cloud Shell](https://learn.microsoft.com/en-us/azure/cloud-shell/overview). - -To install and configure Azure CLI, redirect to [Get started with Azure CLI](https://learn.microsoft.com/en-us/cli/azure/get-started-with-azure-cli), which provides detailed instructions on installation as well as documentation on common Azure CLI commands. Microsoft Azure also has a cloud native service called [Microsoft Genomics](https://www.microsoft.com/en-us/genomics/) which offers cloud implementation of the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK) for secondary analysis. Find documentation on how to use Microsoft Genomics [here](https://learn.microsoft.com/en-us/azure/genomics/overview-what-is-genomics). 
- -## **Azure Marketplace** -The [Microsoft Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/) is an online store in Azure that contains thousands of software applications and services to fit your research needs. For example, you can find VMs configured for Microsoft Genomics or NVIDIA machine learning. Within Cloud Lab, the most common use case for the Marketplace will likely be [CycleCloud](https://learn.microsoft.com/en-us/azure/cyclecloud/tutorials/tutorial?view=cyclecloud-8), which is Azure's High Performance Computing solution. If interested in CycleCloud, please contact us at `CloudLab@nih.gov` so we can help set this up in your Cloud Lab account. - -## **Ingest and Store Data using Azure Storage Accounts** -Microsoft's object storage solution for the cloud is called Azure Blob. Blob is optimized for storing massive amounts of unstructured data. Azure also offers many other storage solutions listed [here](https://azure.microsoft.com/en-us/products/category/storage/). To get started you must create a [Storage Account](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-create?tabs=azure-portal). Users can grant limited access to Azure storage resources using [Shared Access Signatures](https://learn.microsoft.com/en-us/azure/storage/common/storage-sas-overview)(SAS). You can also read our guide to Storage Accounts and moving data in and out of Cloud Lab [here](/docs/create_storage_account.md). This [Microsoft guide](https://microsoft.github.io/Genomics-Community/mydoc_data_migration.html) for moving genomic data is also very helpful. - -## **Virtual Machines** -Virtual machines (VMs) on Azure can be accessed via SSH or from the Azure portal. More information on VMs can be found [here](https://azure.microsoft.com/en-us/products/virtual-machines/#overview) as well as this [guide](https://learn.microsoft.com/en-us/azure/virtual-machines/linux/ssh-from-windows) on how to use SSH keys with windows in Azure. 
To view the different types of VMs available in Azure check out the [Virtual Machine Series](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/series/). - -You can also spin up preconfigured VMs, such as the Azure Data Science VM, which has many data science tools preinstalled and may save you time on environment set up. Read more in [our docs](/docs/Azure_Data_Science_VMs.md). - -Also, for best VM provisioning experience, please see this link for VM best practices in [our docs](/docs/Virtual-machine-best-practices.md). - -## **Azure Functions** -Azure Functions is a serverless solution that allows you to write less code, maintain less infrastructure, and save on costs. Instead of worrying about deploying and maintaining servers, the cloud infrastructure provides all the up-to-date resources needed to keep your applications running. For more information click [here](https://learn.microsoft.com/en-us/azure/azure-functions/). In general, you can consider functions for automating workflows. - -## **Disk Images** -Part of the power of virtual machines is that they offer a blank slate for you to configure as desired. [Azure VM Image Builder](https://azure.microsoft.com/en-us/products/image-builder/#overview) simplifies the image building process allowing for custom built images to be saved. You can later redeploy these images to spin up a new machine with data or environments already installed. - -## **Launch a Machine Learning Workspace (Jupyter Environment)** -[Azure Machine Learning studio](https://learn.microsoft.com/en-us/azure/machine-learning/overview-what-is-azure-machine-learning) is Azure's ML/AI solution. ML studio allows for you to run your own code in managed Jupyter notebooks. Follow the [Quickstart](https://learn.microsoft.com/en-us/azure/machine-learning/quickstart-run-notebooks) page to begin running Jupyter Notebooks in studio. 
Note that you will need to start and stop your compute environment, which is run separately from the notebook. Once in the AzureML portal, go to compute, then you can select Jupyter, Notebooks, or VS Code, which means a lot of flexibility in the way you utilize the compute environment. - -The Azure file share account of your Azure Machine Learning workspace is mounted as a drive on the compute instance. This drive is the default working directory for Jupyter, Jupyter Labs, RStudio, and Posit Workbench. This means that the notebooks and other files you create in Jupyter, JupyterLab, RStudio, or Posit are automatically stored on the file share and available to use in other compute instances as well. - -If you are running complex ML models, look at this Microsoft [blog post](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/azureml-observability-a-scalable-and-extensible-solution-for-ml/ba-p/3474066) for an overview of Microsoft's overvability solution. The source code is [here](https://github.com/microsoft/AzureML-Observability). - -## **Clusters** -One great thing about the cloud is its ability to scale with demand. When you submit a job to a traditional cluster, you specify up front how many CPUs and memory you want to give to your job, and you may over- or under-utilize these resources. With managed resources like serverless and clusters you can leverage a feature called autoscaling, where the compute resources will scale up or down with demand. This is more efficient and keeps costs down when demand is low, but prevents latency when demand is high (think about workshop participants all submitting jobs at the same time to a cluster). For most users of Cloud Lab, the best way to leverage scaling is to use Azure Batch, but in some cases, maybe for a whole lab group or large project, it may make sense to spin up a [Kubernetes cluster](https://azure.microsoft.com/en-us/products/kubernetes-service/). 
- -If you are interested in using a more traditional scheduler like SLURM or Sun Grid Engine, you can use Azure CycleCloud, which has an easy to use GUI as well as CLI options. If interested in CycleCloud, please contact us at `CloudLab@nih.gov` and we will provision a CycleCloud instance for you. - -## **Creating a Conda Environment** -Virtual environments allow you to manage package versions without having package conflicts. For example, if you needed Python 3 for one analysis, but Python 2.7 for another, you could create separate environments to use the two versions of Python. One of the most popular package managers used for creating virtual environments is the [conda package manager](https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/environments.html#:~:text=A%20conda%20environment%20is%20a,NumPy%201.6%20for%20legacy%20testing). We also made a quick guide that you can reference [here](/docs/create_conda_env.md) - -## **Managing Containers with Azure Container Registry** -You can host or pull containers with Azure Container Registry. See [Microsoft's documentation](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal?tabs=azure-cli) on how to use this service. - -## **GitHub** -GitHub is a code hosting platform for version control and collaboration. It lets you and others work together on projects from anywhere. This [tutorial](https://docs.github.com/en/get-started/quickstart/hello-world) teaches you GitHub essentials like repositories, branches, commits, and pull requests. You'll create your own Hello World repository and learn GitHub's pull request workflow, a popular way to create and review code. Since Microsoft owns GitHub, it integrates nicely with Azure. - -## **Billing and Benchmarking** -Many Cloud Lab users are interested in understanding how to estimate the price of a large-scale project using a reduced sample size. 
Generally, you should be able to benchmark with a few representative samples to get an idea of time and cost required for a larger scale project. Follow our [Cost Management Guide](/docs/billing_and_cost_management.md) to see how to tag specific resources for workflow benchmarking. - -In terms of cost, the best way to estimate costs is to use the Azure pricing calculator [here](https://azure.microsoft.com/en-us/pricing/calculator/) for an initial figure, which is a pricing tool that forecasts costs based on products and usage. Then, you can run some benchmarks and double check that everything is acting as you expect. See [our docs](/docs/Using_The_Azure_Price_Calculator.md) on best practices for using this tool. - -## **Cost Optimization** -Follow our [Cost Management Guide](/docs/billing_and_cost_management.md) for details on how to monitor costs, set up budget alerts, and cost-benchmark specific analyses using resource tagging. In addition, here are a few tips to help you stay on budget. You can also configure auto-shutdown on your VM instances following [this guide](/docs/auto-shutdown-instance.md) to prevent you from accidentally leaving instances running. - -## **Getting Support** -As part of your participation in Cloud Lab you will be added to the Cloud Lab Teams channel where you can chat with other Cloud Lab users, and gain support from the Cloud Lab team. For NIH Intramural users, you can submit a support ticket to Service Now. For issues related to the cloud environment, feel free to request [Azure Enterprise Support](/docs/request_enterprise_support.md). For issues related to scientific use cases, such as, `how can I best run an RNAseq pipeline in Azure?`, email us at `CloudLab@nih.gov`. - -## **Additional Training** -This repo only scratches the surface of what can be done in the cloud. If you are interested in additional cloud training opportunities, please visit the [STRIDES Training page](https://cloud.nih.gov/training/). 
For more information on the STRIDES Initiative at the NIH, visit [our website](https://cloud.nih.gov) or contact the NIH STRIDES team at STRIDES@nih.gov for more information. ++ [Artificial Intelligence](#ai) ++ [Clinical Informatics](#ci) ++ [Medical Imaging](#mi) ++ [Genomics on Azure](#bio) ++ [GWAS](#gwas) ++ [BLAST](#blast) ++ [VCF Query](#vcf) ++ [RNAseq](#rna) ++ [scRNAseq](#sc) ++ [Long Read Sequencing Analysis](#long) ++ [Open Data](#open) + +## **Artificial Intelligence** +Machine learning is a subfield of artificial intelligence that focuses on the development of algorithms and models that enable computers to learn from and make predictions or decisions based on data, without being explicitly programmed. Artificial intelligence and machine learning algorithms are being applied to a variety of biomedical research questions, ranging from image classification to genomic variant calling. Azure offers AI services through Azure AI Studio and Azure Machine Learning. + +See our suite of tutorials to learn more about [Gen AI on Azure](/tutorials/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/tutorials/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/tutorials/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/tutorials/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). 
If you are interested in applying RAG to structured data like a csv file, we created one tutorial that walks you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and query your database using a [notebook within Azure ML](/tutorials/notebooks/GenAI/notebooks/llm_query_csv.ipynb), and [one tutorial that runs all the necessary steps directly from a notebook](/tutorials/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). + + ## **Clinical Informatics with FHIR** + Azure Health Data Services is a set of services that enables you to store, process, and analyze medical data in Azure. These services are designed to help organizations quickly connect disparate health data sources and formats, such as structured, imaging, and device data, and normalize it to be persisted in the cloud. At its core, Azure Health Data Services possesses the ability to transform and ingest data into FHIR (Fast Healthcare Interoperability Resources) format. This allows you to transform health data from legacy formats, such as HL7v2 or CDA, or from high-frequency IoT data in device proprietary formats to FHIR. This makes it easier to connect data stored in Azure Health Data Services with services across the Azure ecosystem, like Azure Synapse Analytics and Azure Machine Learning (Azure ML). + +Azure Health Data Services includes support for multiple health data standards for the exchange of structured data, and the ability to deploy multiple instances of different service types (FHIR, DICOM, and MedTech) that seamlessly work with one another. Services deployed within a workspace also share a compliance boundary and common configuration settings. The product scales automatically to meet the varying demands of your workloads, so you spend less time managing infrastructure and more time generating insights from health data.
+ +Copying healthcare data stored in Azure FHIR Server to Synapse Analytics allows researchers to leverage a cloud-scale data warehousing and analytics tool to extract insights from their data as well as build scalable research pipelines. +For information on how to perform this export and downstream analytics, please visit [this repository](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/healthcare-apis/fhir/copy-to-synapse.md). + +You can also see hands-on examples of using [FHIR on Azure](https://github.com/microsoft/genomicsnotebook/tree/main/fhirgenomics), but note that you will need to supply your own VCF files as these are not provided with the tutorial content. + +## **Medical Imaging Analysis** +Medical imaging analysis requires the analysis of large image files and often requires elastic storage and accelerated computing. Microsoft Azure offers cloud-based medical imaging analysis capabilities through its Azure Healthcare APIs and Azure Medical Imaging solutions. Azure's DICOM Service allows for the secure storage, management, and processing of medical images in the cloud, using industry standard DICOM (Digital Imaging and Communications in Medicine) format. The DICOM Service provides features like high availability, disaster recovery, and scalable storage options, making it an ideal solution for pipelines that need to store, manage, and analyze large amounts of medical imaging data. In addition, the server integrates with other Azure services like Azure ML, facilitating the use of advanced machine learning algorithms for image analysis tasks such as object detection, segmentation, and classification. Read about how to deploy the service [here](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/deploy-dicom-services-in-azure). + +Microsoft has several medical imaging notebooks that showcase different medical imaging use-cases on Azure Machine Learning. 
These notebooks demonstrate various data science techniques such as manual model development with PyTorch, automated machine learning, and MLOps-based examples for automating the machine learning lifecycle in medical use cases, including retraining. +These notebooks are available [here](https://github.com/Azure/medical-imaging). Make sure you select a kernel that includes PyTorch; otherwise, installing the dependencies can be challenging. Note also that you need to use a GPU VM for most of the notebook cells, but you can create several compute environments and switch between them as needed. Be sure to shut them off when you are finished. + +For Cloud Lab users interested in multi-modal clinical informatics, DICOMcast provides the ability to synchronize data from a DICOM service to a FHIR service, allowing users to integrate clinical and imaging data. DICOMcast expands the use cases for health data by supporting both a streamlined view of longitudinal patient data and the ability to effectively create cohorts for medical studies, analytics, and machine learning. For more information on how to utilize DICOMcast, please visit Microsoft’s [documentation](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/dicom-cast-overview) or the open-source [GitHub repository](https://github.com/microsoft/dicom-server/blob/main/docs/quickstarts/deploy-dicom-cast.md). + +For users hoping to train deep learning models on imaging data, InnerEye-DeepLearning (IE-DL) is a toolbox that Microsoft developed for easily training deep learning models on 3D medical images. Simple to run both locally and in the cloud with Azure Machine Learning, it allows users to train and run inference on the following: +• Segmentation models +• Classification and regression models +• Any PyTorch Lightning model, via a bring-your-own-model setup +This project exists in a separate [GitHub repository](https://github.com/microsoft/InnerEye-DeepLearning).
+ +## **Microsoft Genomics** +Microsoft has several genomics-related offerings that will be useful to many Cloud Lab users. For a broad overview, visit the [Microsoft Genomics Community site](https://microsoft.github.io/Genomics-Community/index.html). You can also get an overview of different execution options from [this blog](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/genomic-workflow-managers-on-microsoft-azure/ba-p/3747052), and a detailed analysis for Nextflow with Azure Batch at [this blog](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/rna-sequencing-analysis-on-azure-using-nextflow-configuration/ba-p/3738854). We highlight a few key services here: ++ [Genomics Notebooks](https://github.com/microsoft/genomicsnotebook): These example notebooks highlight many common use cases in genomics research. The Bioconductor/RStudio notebook will not work in Cloud Lab. To run RStudio, look at [Posit Workbench from the Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/rstudio-5237862.rstudioserverprostandard). ++ [Cromwell on Azure](https://github.com/microsoft/CromwellOnAzure): Documentation on how to spin up the resources needed to run Cromwell on Azure. Note that this service will not work within Cloud Lab because you need high-level permissions, but we list it here for demonstration purposes. ++ [Microsoft Genomics](https://learn.microsoft.com/en-us/azure/genomics/quickstart-run-genomics-workflow-portal): Run BWA and GATK using this managed service. Note that it uses Python 2.7 and thus is not compatible with AzureML (which uses Python 3), but you can run it from any other shell environment. ++ [Nextflow on Azure](https://microsoft.github.io/Genomics-Community/mydoc_nextflow.html): Run Nextflow workflows using Azure Batch.
++ [NVIDIA Parabricks for Secondary Genomics Analysis on Azure](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/benchmarking-the-nvidia-clara-parabricks-for-secondary-genomics/ba-p/3722434). Follow this guide to run Parabricks on a VM by pulling the Docker container directly from NVIDIA. + +## **Genome Wide Association Studies** +Genome-wide association studies (GWAS) are large-scale investigations that analyze the genomes of many individuals to identify common genetic variants associated with traits, diseases, or other phenotypes. +- This [NIH CFDE written tutorial](https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud +) walks you through running a simple GWAS on AWS; we adapted it for Azure in [this notebook](/tutorials/notebooks/GWAS). Note that the CFDE page has a few other bioinformatics-related tutorials like BLAST and Illumina read simulation. +- This blog post [illustrates some of the costs associated](https://techcommunity.microsoft.com/t5/azure-high-performance-computing/azure-to-accelerate-genome-wide-analysis-study/ba-p/2644120) with running GWAS on Azure. + +## **NCBI BLAST+** +NCBI BLAST (Basic Local Alignment Search Tool) is a widely used bioinformatics program provided by the National Center for Biotechnology Information (NCBI) that compares nucleotide or protein sequences against a large database to identify similar sequences and infer evolutionary relationships, functional annotations, and structural information. +- [This Microsoft Blog](https://techcommunity.microsoft.com/t5/azure-high-performance-computing/running-ncbi-blast-on-azure-performance-scalability-and-best/ba-p/2410483) explains how to optimize BLAST analyses on Azure VMs. Feel free to install BLAST+ on a VM or an AzureML notebook and run queries there. + +## **Query a VCF file in Azure Synapse** +- You can use SQL to rapidly query a VCF file in Azure Synapse.
This requires converting the file from VCF to Parquet format, a common format for databases. Read more about how to do this in Azure on [this Microsoft blog](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/genomic-data-in-parquet-format-on-azure/ba-p/3150554). Although the notebooks for this tutorial are bundled with the other genomics notebooks, to get them to work you will need to use Azure Databricks or Synapse Analytics, not AzureML. + +## **RNAseq** +RNA-seq analysis is a high-throughput sequencing method that allows the measurement and characterization of gene expression levels and transcriptome dynamics. Workflows are typically run using workflow managers, and final results can often be visualized in notebooks. +- You can run this [Nextflow on Azure tutorial](https://microsoft.github.io/Genomics-Community/mydoc_nextflow.html) for RNAseq in a variety of ways on Azure. Following the instructions outlined above, you could use Virtual Machines, Azure Machine Learning, or Azure Batch. +- For a notebook version of a complete RNAseq pipeline from Fastq to Salmon quantification from the NIGMS Sandbox Program use this [notebook](/tutorials/notebooks/rnaseq-myco-tutorial-main), which we re-wrote to work on Azure. + +## **Single Cell RNAseq** +Single-cell RNA sequencing (scRNA-seq) is a technique that enables the analysis of gene expression at the individual cell level, providing insights into cellular heterogeneity, identifying rare cell types, and revealing cellular dynamics and functional states within complex biological systems. +- This [NVIDIA blog](https://developer.nvidia.com/blog/accelerating-single-cell-genomic-analysis-using-rapids/) details how to run an accelerated scRNAseq pipeline using RAPIDS. You can find a link to the GitHub that has lots of example notebooks [here](https://github.com/clara-parabricks/rapids-single-cell-examples). For each example use case they show some nice benchmarking data with time and cost for CPU vs.
GPU machine types on AWS. You will see that most runs cost less than $1.00 with GPU machines (priced on AWS). If you want a CPU version that uses Scanpy, you can use this [notebook](https://github.com/clara-parabricks/rapids-single-cell-examples/blob/master/notebooks/hlca_lung_cpu_analysis.ipynb). Pay careful attention to the environment setup as there are a lot of dependencies for these notebooks. Create a conda environment in the terminal, then run the notebook. Consider using [mamba](https://github.com/mamba-org/mamba) to speed up environment creation. We created a [guide](/docs/create_conda_env.md) for conda environment set up as well. + +## **Long Read Sequence Analysis** +Long read DNA sequence analysis involves analyzing sequencing reads typically longer than 10 thousand base pairs (bp) in length, compared with short read sequencing where reads are about 150 bp in length. +Oxford Nanopore has a pretty complete offering of notebook tutorials for handling long read data to do a variety of things including variant calling, RNAseq, SARS-CoV-2 analysis and much more. Access the notebooks [here](https://labs.epi2me.io/nbindex/) and on [GitHub](https://github.com/epi2me-labs). These notebooks expect you are running locally and accessing the epi2me notebook server. To run them in Cloud Lab, skip the first cell that connects to the server and then the rest of the notebook should run correctly, with a few tweaks. Oxford Nanopore also offers a host of [Nextflow workflows](https://labs.epi2me.io/wfindex/) that will allow you to run a variety of long read pipelines. + +## **Open Data** +These publicly available datasets can save you time on data discovery and preparation by being curated and ready to use in your workflows. ++ The [COVID-19 Data Lake](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-covid-19-data-lake) contains COVID-19 related datasets from various sources.
It covers testing and patient outcome tracking data, social distancing policy, hospital capacity and mobility. ++ In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the [COVID-19 Open Research Dataset (CORD-19)](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-covid-19-open-research?tabs=azure-storage). This dataset is a free resource of over 47,000 scholarly articles, including over 36,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset mobilizes researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. ++ [The Genomics Data Lake](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-genomics-data-lake) provides various public datasets that you can access for free and integrate into your genomics analysis workflows and applications. 
The datasets include genome sequences, variant info, and subject/sample metadata in BAM, FASTA, VCF, CSV file formats: [Illumina Platinum Genomes](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-illumina-platinum-genomes), [Human Reference Genomes](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-human-reference-genomes), [ClinVar Annotations](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-clinvar-annotations), [SnpEff](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-snpeff), [Genome Aggregation Database (gnomAD)](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-gnomad), [1000 Genomes](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-1000-genomes), [OpenCravat](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-open-cravat), [ENCODE](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-encode), [GATK Resource Bundle](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-gatk-resource-bundle). 
diff --git a/tutorials/CycleCloud/CycleCloud_CustomRole.json b/envs/CycleCloud_CustomRole.json similarity index 100% rename from tutorials/CycleCloud/CycleCloud_CustomRole.json rename to envs/CycleCloud_CustomRole.json diff --git a/tutorials/notebooks/GWAS/GWAS_coat_color.ipynb b/notebooks/GWAS/GWAS_coat_color.ipynb similarity index 100% rename from tutorials/notebooks/GWAS/GWAS_coat_color.ipynb rename to notebooks/GWAS/GWAS_coat_color.ipynb diff --git a/tutorials/notebooks/GenAI/Azure_AI_Studio_README.md b/notebooks/GenAI/Azure_AI_Studio_README.md similarity index 100% rename from tutorials/notebooks/GenAI/Azure_AI_Studio_README.md rename to notebooks/GenAI/Azure_AI_Studio_README.md diff --git a/tutorials/notebooks/GenAI/Azure_Open_AI_README.md b/notebooks/GenAI/Azure_Open_AI_README.md similarity index 100% rename from tutorials/notebooks/GenAI/Azure_Open_AI_README.md rename to notebooks/GenAI/Azure_Open_AI_README.md diff --git a/tutorials/notebooks/GenAI/LICENSE b/notebooks/GenAI/LICENSE similarity index 100% rename from tutorials/notebooks/GenAI/LICENSE rename to notebooks/GenAI/LICENSE diff --git a/tutorials/notebooks/GenAI/embedding_demos/acs_embeddings.py b/notebooks/GenAI/embedding_demos/acs_embeddings.py similarity index 100% rename from tutorials/notebooks/GenAI/embedding_demos/acs_embeddings.py rename to notebooks/GenAI/embedding_demos/acs_embeddings.py diff --git a/tutorials/notebooks/GenAI/embedding_demos/aoai_embeddings.py b/notebooks/GenAI/embedding_demos/aoai_embeddings.py similarity index 100% rename from tutorials/notebooks/GenAI/embedding_demos/aoai_embeddings.py rename to notebooks/GenAI/embedding_demos/aoai_embeddings.py diff --git a/tutorials/notebooks/GenAI/example_scripts/example_azureaisearch_openaichat_zeroshot.py b/notebooks/GenAI/example_scripts/example_azureaisearch_openaichat_zeroshot.py similarity index 100% rename from tutorials/notebooks/GenAI/example_scripts/example_azureaisearch_openaichat_zeroshot.py rename to 
notebooks/GenAI/example_scripts/example_azureaisearch_openaichat_zeroshot.py diff --git a/tutorials/notebooks/GenAI/example_scripts/example_langchain_openaichat_zeroshot.py b/notebooks/GenAI/example_scripts/example_langchain_openaichat_zeroshot.py similarity index 100% rename from tutorials/notebooks/GenAI/example_scripts/example_langchain_openaichat_zeroshot.py rename to notebooks/GenAI/example_scripts/example_langchain_openaichat_zeroshot.py diff --git a/tutorials/notebooks/GenAI/example_scripts/workshop_embedding.py b/notebooks/GenAI/example_scripts/workshop_embedding.py similarity index 100% rename from tutorials/notebooks/GenAI/example_scripts/workshop_embedding.py rename to notebooks/GenAI/example_scripts/workshop_embedding.py diff --git a/tutorials/notebooks/GenAI/example_scripts/workshop_search.py b/notebooks/GenAI/example_scripts/workshop_search.py similarity index 100% rename from tutorials/notebooks/GenAI/example_scripts/workshop_search.py rename to notebooks/GenAI/example_scripts/workshop_search.py diff --git a/tutorials/notebooks/GenAI/microsoft-earnings.csv b/notebooks/GenAI/microsoft-earnings.csv similarity index 100% rename from tutorials/notebooks/GenAI/microsoft-earnings.csv rename to notebooks/GenAI/microsoft-earnings.csv diff --git a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb similarity index 100% rename from tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb rename to notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb diff --git a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb similarity index 100% rename from tutorials/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb rename to 
notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb diff --git a/tutorials/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb similarity index 100% rename from tutorials/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb rename to notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb diff --git a/tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb b/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb similarity index 100% rename from tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb rename to notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb diff --git a/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb similarity index 100% rename from tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb rename to notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb diff --git a/tutorials/notebooks/GenAI/requirements.txt b/notebooks/GenAI/requirements.txt similarity index 100% rename from tutorials/notebooks/GenAI/requirements.txt rename to notebooks/GenAI/requirements.txt diff --git a/tutorials/notebooks/GenAI/search_documents/Hurricane_Irene_(2005).pdf b/notebooks/GenAI/search_documents/Hurricane_Irene_(2005).pdf similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/Hurricane_Irene_(2005).pdf rename to notebooks/GenAI/search_documents/Hurricane_Irene_(2005).pdf diff --git a/tutorials/notebooks/GenAI/search_documents/Koutros_et_al_2023.pdf b/notebooks/GenAI/search_documents/Koutros_et_al_2023.pdf similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/Koutros_et_al_2023.pdf rename to notebooks/GenAI/search_documents/Koutros_et_al_2023.pdf diff --git a/tutorials/notebooks/GenAI/search_documents/New_York_State_Route_373.pdf b/notebooks/GenAI/search_documents/New_York_State_Route_373.pdf similarity index 100% rename from 
tutorials/notebooks/GenAI/search_documents/New_York_State_Route_373.pdf rename to notebooks/GenAI/search_documents/New_York_State_Route_373.pdf diff --git a/tutorials/notebooks/GenAI/search_documents/Rai_et_al_2023.pdf b/notebooks/GenAI/search_documents/Rai_et_al_2023.pdf similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/Rai_et_al_2023.pdf rename to notebooks/GenAI/search_documents/Rai_et_al_2023.pdf diff --git a/tutorials/notebooks/GenAI/search_documents/Silverman_et_al_2023.pdf b/notebooks/GenAI/search_documents/Silverman_et_al_2023.pdf similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/Silverman_et_al_2023.pdf rename to notebooks/GenAI/search_documents/Silverman_et_al_2023.pdf diff --git a/tutorials/notebooks/GenAI/search_documents/aoai_workshop_content.pdf b/notebooks/GenAI/search_documents/aoai_workshop_content.pdf similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/aoai_workshop_content.pdf rename to notebooks/GenAI/search_documents/aoai_workshop_content.pdf diff --git a/tutorials/notebooks/GenAI/search_documents/grant_data_sub1.txt b/notebooks/GenAI/search_documents/grant_data_sub1.txt similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/grant_data_sub1.txt rename to notebooks/GenAI/search_documents/grant_data_sub1.txt diff --git a/tutorials/notebooks/GenAI/search_documents/grant_data_sub2.txt b/notebooks/GenAI/search_documents/grant_data_sub2.txt similarity index 100% rename from tutorials/notebooks/GenAI/search_documents/grant_data_sub2.txt rename to notebooks/GenAI/search_documents/grant_data_sub2.txt diff --git a/tutorials/notebooks/SRADownload/SRA-Download.ipynb b/notebooks/SRADownload/SRA-Download.ipynb similarity index 100% rename from tutorials/notebooks/SRADownload/SRA-Download.ipynb rename to notebooks/SRADownload/SRA-Download.ipynb diff --git a/tutorials/notebooks/SpleenLiverSegmentation/README.md b/notebooks/SpleenLiverSegmentation/README.md 
similarity index 100% rename from tutorials/notebooks/SpleenLiverSegmentation/README.md rename to notebooks/SpleenLiverSegmentation/README.md diff --git a/tutorials/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb similarity index 100% rename from tutorials/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb rename to notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb diff --git a/tutorials/notebooks/SpleenLiverSegmentation/monai_data/Spleen_best_metric_model_pretrained.pth b/notebooks/SpleenLiverSegmentation/monai_data/Spleen_best_metric_model_pretrained.pth similarity index 100% rename from tutorials/notebooks/SpleenLiverSegmentation/monai_data/Spleen_best_metric_model_pretrained.pth rename to notebooks/SpleenLiverSegmentation/monai_data/Spleen_best_metric_model_pretrained.pth diff --git a/tutorials/notebooks/pangolin/pangolin.yaml b/notebooks/pangolin/pangolin.yaml similarity index 100% rename from tutorials/notebooks/pangolin/pangolin.yaml rename to notebooks/pangolin/pangolin.yaml diff --git a/tutorials/notebooks/pangolin/pangolin_pipeline.ipynb b/notebooks/pangolin/pangolin_pipeline.ipynb similarity index 100% rename from tutorials/notebooks/pangolin/pangolin_pipeline.ipynb rename to notebooks/pangolin/pangolin_pipeline.ipynb diff --git a/tutorials/notebooks/rnaseq-myco-tutorial-main/LICENSE b/notebooks/rnaseq-myco-tutorial-main/LICENSE similarity index 100% rename from tutorials/notebooks/rnaseq-myco-tutorial-main/LICENSE rename to notebooks/rnaseq-myco-tutorial-main/LICENSE diff --git a/tutorials/notebooks/rnaseq-myco-tutorial-main/README.md b/notebooks/rnaseq-myco-tutorial-main/README.md similarity index 100% rename from tutorials/notebooks/rnaseq-myco-tutorial-main/README.md rename to notebooks/rnaseq-myco-tutorial-main/README.md diff --git a/tutorials/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb 
b/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb similarity index 100% rename from tutorials/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb rename to notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb diff --git a/tutorials/notebooks/rnaseq-myco-tutorial-main/images/count-workflow.png b/notebooks/rnaseq-myco-tutorial-main/images/count-workflow.png similarity index 100% rename from tutorials/notebooks/rnaseq-myco-tutorial-main/images/count-workflow.png rename to notebooks/rnaseq-myco-tutorial-main/images/count-workflow.png diff --git a/tutorials/notebooks/rnaseq-myco-tutorial-main/images/rnaseq-workflow.png b/notebooks/rnaseq-myco-tutorial-main/images/rnaseq-workflow.png similarity index 100% rename from tutorials/notebooks/rnaseq-myco-tutorial-main/images/rnaseq-workflow.png rename to notebooks/rnaseq-myco-tutorial-main/images/rnaseq-workflow.png diff --git a/tutorials/notebooks/rnaseq-myco-tutorial-main/images/table-cushman.png b/notebooks/rnaseq-myco-tutorial-main/images/table-cushman.png similarity index 100% rename from tutorials/notebooks/rnaseq-myco-tutorial-main/images/table-cushman.png rename to notebooks/rnaseq-myco-tutorial-main/images/table-cushman.png diff --git a/tutorials/README.md b/tutorials/README.md deleted file mode 100644 index 3a44017..0000000 --- a/tutorials/README.md +++ /dev/null @@ -1,85 +0,0 @@ -# Microsoft Azure Tutorial Resources - ---------------------------------- -## Overview of Page Contents - -+ [Artificial Intelligence](#ai) -+ [Clinical Informatics](#ci) -+ [Medical Imaging](#mi) -+ [Genomics on Azure](#bio) -+ [GWAS](#gwas) -+ [BLAST](#blast) -+ [VCF Query](#vcf) -+ [RNAseq](#rna) -+ [scRNAseq](#sc) -+ [Long Read Sequencing Analysis](#long) -+ [Open Data](#open) - -## **Artificial Intelligence** -Machine learning is a subfield of artificial intelligence that focuses on the development of algorithms and models that enable computers to learn from and make predictions or decisions based on data, without 
being explicitly programmed. Artificial intelligence and machine learning algorithms are being applied to a variety of biomedical research questions, ranging from image classification to genomic variant calling. Azure offers AI services through Azure AI Studio and Azure Machine Learning. - -See our suite of tutorials to learn more about [Gen AI on Azure](/tutorials/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/tutorials/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/tutorials/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/tutorials/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). If you are interested in applying RAG to structured data like a csv file, we created one tutorials that walks you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and a query your database using a [notebook within Azure ML](/tutorials/notebooks/GenAI/notebooks/llm_query_csv.ipynb), and [one tutorial that runs all the necessary steps directly from a notebook](/tutorials/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). - - ## **Clinical Informatics with FHIR** - Azure Health Data Services is a set of services that enables you to store, process, and analyze medical data in Azure. These services are designed to help organizations quickly connect disparate health data sources and formats, such as structured, imaging, and device data, and normalize it to be persisted in the cloud. At its core, Azure Health Data Services possesses the ability to transform and ingest data into FHIR (Fast Healthcare Interoperability Resources) format. 
This allows you to transform health data from legacy formats, such as HL7v2 or CDA, or from high-frequency IoT data in device proprietary formats to FHIR. This makes it easier to connect data stored in Azure Health Data Services with services across the Azure ecosystem, like Azure Synapse Analytics, and Azure Machine Learning (Azure ML). - -Azure Health Data Services includes support for multiple health data standards for the exchange of structured data, and the ability to deploy multiple instances of different service types (FHIR, DICOM, and MedTech) that seamlessly work with one another. Services deployed within a workspace also share a compliance boundary and common configuration settings. The product scales automatically to meet the varying demands of your workloads, so you spend less time managing infrastructure and more time generating insights from health data. - -Copying healthcare data stored in Azure FHIR Server to Synapse Analytics allows researchers to leverage a cloud-scale data warehousing and analytics tool to extract insights from their data as well as build scalable research pipelines. -For information on how to perform this export and downstream analytics, please visit [this repository](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/healthcare-apis/fhir/copy-to-synapse.md). - -You can also see hands-on examples of using [FHIR on Azure](https://github.com/microsoft/genomicsnotebook/tree/main/fhirgenomics), but note that you will need to supply your own VCF files as these are not provided with the tutorial content. - -## **Medical Imaging Analysis** -Medical imaging analysis requires the analysis of large image files and often requires elastic storage and accelerated computing. Microsoft Azure offers cloud-based medical imaging analysis capabilities through its Azure Healthcare APIs and Azure Medical Imaging solutions. 
Azure's DICOM Service allows for the secure storage, management, and processing of medical images in the cloud, using industry standard DICOM (Digital Imaging and Communications in Medicine) format. The DICOM Service provides features like high availability, disaster recovery, and scalable storage options, making it an ideal solution for pipelines that need to store, manage, and analyze large amounts of medical imaging data. In addition, the server integrates with other Azure services like Azure ML, facilitating the use of advanced machine learning algorithms for image analysis tasks such as object detection, segmentation, and classification. Read about how to deploy the service [here](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/deploy-dicom-services-in-azure). - -Microsoft has several medical imaging notebooks that showcase different medical imaging use-cases on Azure Machine Learning. These notebooks demonstrate various data science techniques such as manual model development with PyTorch, automated machine learning, and MLOPS-based examples for automating the machine learning lifecycle in medical use cases, including retraining. -These notebooks are available [here](https://github.com/Azure/medical-imaging). Make sure you select a kernel that includes Pytorch else the install of dependencies can be challenging. Note also that you need to use a GPU VM for most of the notebook cells, but you can create several compute environments and switch between them as needed. Be sure to shut them off when you are finished. - -For Cloud Lab users interested in multi-modal clinical informatics, DICOMcast provides the ability to synchronize data from a DICOM service to a FHIR service, allowing users to integrate clinical and imaging data. DICOMcast expands the use cases for health data by supporting both a streamlined view of longitudinal patient data and the ability to effectively create cohorts for medical studies, analytics, and machine learning. 
For more information on how to utilize DICOMcast please visit Microsoft’s [documentation](https://learn.microsoft.com/en-us/azure/healthcare-apis/dicom/dicom-cast-overview) or the open-source [GitHub repository](https://github.com/microsoft/dicom-server/blob/main/docs/quickstarts/deploy-dicom-cast.md). - -For users hoping to train deep learning models on imaging data, InnerEye-DeepLearning (IE-DL) is a toolbox that Microsoft developed for easily training deep learning models on 3D medical images. Simple to run both locally and in the cloud with Azure Machine Learning, it allows users to train and run inference on the following: -• Segmentation models -• Classification and regression models -• Any PyTorch Lightning model, via a bring-your-own-model setup -This project exists in a separate [GitHub repository](https://github.com/microsoft/InnerEye-DeepLearning). - -## **Microsoft Genomics** -Microsoft has several genomics-related offerings that will be useful to many Cloud Lab users. For a broad overview, visit the [Microsoft Genomics Community site](https://microsoft.github.io/Genomics-Community/index.html). You can also get an overview of different execution options from [this blog](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/genomic-workflow-managers-on-microsoft-azure/ba-p/3747052), and a detailed analysis for Nextflow with AWS Batch at [this blog](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/rna-sequencing-analysis-on-azure-using-nextflow-configuration/ba-p/3738854). We highlight a few key services here: -+ [Genomics Notebooks](https://github.com/microsoft/genomicsnotebook): These example notebooks highlight many common use cases in genomics research. The Bioconductor/Rstudio notebook will not work in Cloud Lab. To run Rstudio, look at [Posit Workbench from the Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/rstudio-5237862.rstudioserverprostandard). 
-+ [Cromwell on Azure](https://github.com/microsoft/CromwellOnAzure): Documentation on how to spin up the resources needed to run Cromwell on Azure. Note that this service will not work within Cloud Lab because you need high-level permissions, but we list it here for demonstration purposes. -+ [Microsoft Genomics](https://learn.microsoft.com/en-us/azure/genomics/quickstart-run-genomics-workflow-portal): Run BWA and GATK using this managed service. Note that it uses Python 2.7 and thus is not compatible with AzureML (which uses Python 3), but you can run it from any other shell environment. -+ [Nextflow on Azure](https://microsoft.github.io/Genomics-Community/mydoc_nextflow.html): Run Nextflow workflows using Azure Batch. -+ [NVIDIA Parabricks for Secondary Genomics Analysis on Azure](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/benchmarking-the-nvidia-clara-parabricks-for-secondary-genomics/ba-p/3722434). Follow this guide to run Parabricks on a VM by pulling the Docker container directly from NVIDIA. - -## **Genome Wide Association Studies** -Genome-wide association studies (GWAS) are large-scale investigations that analyze the genomes of many individuals to identify common genetic variants associated with traits, diseases, or other phenotypes. -- This [NIH CFDE written tutorial](https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud -) walks you through running a simple GWAS on AWS, thus we converted it to Azure in [this notebook](/tutorials/notebooks/GWAS). Note that the CFDE page has a few other bioinformatics related tutorials like BLAST and Illumina read simulation. 
-- This blog post [illustrates some of the costs associated](https://techcommunity.microsoft.com/t5/azure-high-performance-computing/azure-to-accelerate-genome-wide-analysis-study/ba-p/2644120) with running GWAS on Azure - -## **NCBI BLAST+** -NCBI BLAST (Basic Local Alignment Search Tool) is a widely used bioinformatics program provided by the National Center for Biotechnology Information (NCBI) that compares nucleotide or protein sequences against a large database to identify similar sequences and infer evolutionary relationships, functional annotations, and structural information. -- [This Microsoft Blog](https://techcommunity.microsoft.com/t5/azure-high-performance-computing/running-ncbi-blast-on-azure-performance-scalability-and-best/ba-p/2410483) explains how to optimize BLAST analyses on Azure VMs. Feel free to install BLAST+ on a VM or an AzureML notebook and run queries there. - -## **Query a VCF file in Azure Synapse** -- You can use SQL to rapidly query a VCF file in Azure Synapse. The requires converting the file from VCF to Parquet format, a common format for databases. Read more about how to do this in Azure on [this Microsoft blog](https://techcommunity.microsoft.com/t5/healthcare-and-life-sciences/genomic-data-in-parquet-format-on-azure/ba-p/3150554). Although the notebooks for this tutorial are bundled with the other genomics notebooks, to get them to work you will need to use Azure Databricks or Synapse Analytics, not AzureML. - -## **RNAseq** -RNA-seq analysis is a high-throughput sequencing method that allows the measurement and characterization of gene expression levels and transcriptome dynamics. Workflows are typically run using workflow managers, and final results can often be visualized in notebooks. -- You can run this [Nextflow on Azure tutorial](https://microsoft.github.io/Genomics-Community/mydoc_nextflow.html) for RNAseq a variety of ways on Azure. 
Following the instructions outlined above, you could use Virtual Machines, Azure Machine Learning, or Azure Batch. -- For a notebook version of a complete RNAseq pipeline from Fastq to Salmon quantification from the NIGMS Sandbox Program use this [notebook](/tutorials/notebooks/rnaseq-myco-tutorial-main), which we re-wrote to work on Azure. - -## **Single Cell RNAseq** -Single-cell RNA sequencing (scRNA-seq) is a technique that enables the analysis of gene expression at the individual cell level, providing insights into cellular heterogeneity, identifying rare cell types, and revealing cellular dynamics and functional states within complex biological systems. -- This [NVIDIA blog](https://developer.nvidia.com/blog/accelerating-single-cell-genomic-analysis-using-rapids/) details how to run an accelerated scRNAseq pipeline using RAPIDS. You can find a link to the GitHub that has lots of example notebooks [here](https://github.com/clara-parabricks/rapids-single-cell-examples). For each example use case they show some nice benchmarking data with time and cost for CPU vs. GPU machine types on AWS. You will see that most runs cost less than $1.00 with GPU machines (priced on AWS). If you want a CPU version that users Scanpy you can use this [notebook](https://github.com/clara-parabricks/rapids-single-cell-examples/blob/master/notebooks/hlca_lung_cpu_analysis.ipynb). Pay careful attention to the environment setup as there are a lot of dependencies for these notebooks. Create a conda environment in the terminal, then run the notebook. Consider using [mamba](https://github.com/mamba-org/mamba) to speed up environment creation. We created a [guide](/docs/create_conda_env.md) for conda environment set up as well. - -## **Long Read Sequence Analysis** -Long read DNA sequence analysis involves analyzing sequencing reads typically longer than 10 thousand base pairs (bp) in length, compared with short read sequencing where reads are about 150 bp in length. 
-Oxford Nanopore has a pretty complete offering of notebook tutorials for handling long read data to do a variety of things including variant calling, RNAseq, Sars-Cov-2 analysis and much more. Access the notebooks [here](https://labs.epi2me.io/nbindex/) and on [GitHub](https://github.com/epi2me-labs). These notebooks expect you are running locally and accessing the epi2me notebook server. To run them in Cloud Lab, skip the first cell that connects to the server and then the rest of the notebook should run correctly, with a few tweaks. Oxford Nanopore also offers a host of [Nextflow workflows](https://labs.epi2me.io/wfindex/) that will allow you to run a variety of long read pipelines. - -## **Open Data** -These publicly available datasets can save you time on data discovery and preparation by being curated and ready to use in your workflows. -+ The [COVID-19 Data Lake](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-covid-19-data-lake) contains COVID-19 related datasets from various sources. It covers testing and patient outcome tracking data, social distancing policy, hospital capacity and mobility. -+ In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the [COVID-19 Open Research Dataset (CORD-19)](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-covid-19-open-research?tabs=azure-storage). This dataset is a free resource of over 47,000 scholarly articles, including over 36,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset mobilizes researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. 
-+ [The Genomics Data Lake](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-genomics-data-lake) provides various public datasets that you can access for free and integrate into your genomics analysis workflows and applications. The datasets include genome sequences, variant info, and subject/sample metadata in BAM, FASTA, VCF, CSV file formats: [Illumina Platinum Genomes](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-illumina-platinum-genomes), [Human Reference Genomes](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-human-reference-genomes), [ClinVar Annotations](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-clinvar-annotations), [SnpEff](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-snpeff), [Genome Aggregation Database (gnomAD)](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-gnomad), [1000 Genomes](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-1000-genomes), [OpenCravat](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-open-cravat), [ENCODE](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-encode), [GATK Resource Bundle](https://learn.microsoft.com/en-us/azure/open-datasets/dataset-gatk-resource-bundle). From e27d356c84e6640221097ec6c106b6d2dbb192c7 Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Thu, 29 Feb 2024 14:09:19 -0500 Subject: [PATCH 04/25] added tagline to readme --- README.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/README.md b/README.md index 3a44017..d47377c 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,11 @@ +>This repository falls under the NIH STRIDES Initiative. STRIDES aims to harness the power of the cloud to accelerate biomedical discoveries. To learn more, visit https://cloud.nih.gov. + # Microsoft Azure Tutorial Resources +NIH Cloud Lab’s goal is to make Cloud easy and accessible for you, so that you can spend less time on administrative tasks and focus more on research. 
+ +Use this repository to learn about how to use Azure by exploring the linked resources and walking through the tutorials. If you are a beginner, we suggest you begin the jumpstart section on the [Cloud Lab website](https://cloud.nih.gov/resources/cloudlab/). + --------------------------------- ## Overview of Page Contents From 9a26fe96c5969e26817a907b6420fc32799a8a86 Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Thu, 29 Feb 2024 14:15:43 -0500 Subject: [PATCH 05/25] modified links --- .markdown-link-check.json | 4 ++-- README.md | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/.markdown-link-check.json b/.markdown-link-check.json index b394a03..1b4f096 100644 --- a/.markdown-link-check.json +++ b/.markdown-link-check.json @@ -13,8 +13,8 @@ "replacement": "https://github.com/STRIDES/NIHCloudLabAzure/tree/main/docs" }, { - "pattern": "^/tutorials", - "replacement": "https://github.com/STRIDES/NIHCloudLabAzure/tree/main/tutorials" + "pattern": "^/notebooks", + "replacement": "https://github.com/STRIDES/NIHCloudLabAzure/tree/main/notebooks" } ], diff --git a/README.md b/README.md index d47377c..c5bb462 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ NIH Cloud Lab’s goal is to make Cloud easy and accessible for you, so that you can spend less time on administrative tasks and focus more on research. -Use this repository to learn about how to use Azure by exploring the linked resources and walking through the tutorials. If you are a beginner, we suggest you begin the jumpstart section on the [Cloud Lab website](https://cloud.nih.gov/resources/cloudlab/). +Use this repository to learn about how to use Azure by exploring the linked resources and walking through the tutorials. If you are a beginner, we suggest you begin the jumpstart section on the [Cloud Lab website](https://cloud.nih.gov/resources/cloudlab/) before returning here.
--------------------------------- ## Overview of Page Contents @@ -24,7 +24,7 @@ Use this repository to learn about how to use Azure by exploring the linked reso ## **Artificial Intelligence** Machine learning is a subfield of artificial intelligence that focuses on the development of algorithms and models that enable computers to learn from and make predictions or decisions based on data, without being explicitly programmed. Artificial intelligence and machine learning algorithms are being applied to a variety of biomedical research questions, ranging from image classification to genomic variant calling. Azure offers AI services through Azure AI Studio and Azure Machine Learning. -See our suite of tutorials to learn more about [Gen AI on Azure](/tutorials/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/tutorials/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/tutorials/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/tutorials/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/tutorials/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). If you are interested in applying RAG to structured data like a csv file, we created one tutorials that walks you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and a query your database using a [notebook within Azure ML](/tutorials/notebooks/GenAI/notebooks/llm_query_csv.ipynb), and [one tutorial that runs all the necessary steps directly from a notebook](/tutorials/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). 
+See our suite of tutorials to learn more about [Gen AI on Azure](/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). If you are interested in applying RAG to structured data like a csv file, we created one tutorial that walks you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and query your database using a [notebook within Azure ML](/notebooks/GenAI/notebooks/llm_query_csv.ipynb), and [one tutorial that runs all the necessary steps directly from a notebook](/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). ## **Clinical Informatics with FHIR** Azure Health Data Services is a set of services that enables you to store, process, and analyze medical data in Azure. These services are designed to help organizations quickly connect disparate health data sources and formats, such as structured, imaging, and device data, and normalize it to be persisted in the cloud. At its core, Azure Health Data Services possesses the ability to transform and ingest data into FHIR (Fast Healthcare Interoperability Resources) format. This allows you to transform health data from legacy formats, such as HL7v2 or CDA, or from high-frequency IoT data in device proprietary formats to FHIR. This makes it easier to connect data stored in Azure Health Data Services with services across the Azure ecosystem, like Azure Synapse Analytics, and Azure Machine Learning (Azure ML).
@@ -61,7 +61,7 @@ Microsoft has several genomics-related offerings that will be useful to many Clo ## **Genome Wide Association Studies** Genome-wide association studies (GWAS) are large-scale investigations that analyze the genomes of many individuals to identify common genetic variants associated with traits, diseases, or other phenotypes. - This [NIH CFDE written tutorial](https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud -) walks you through running a simple GWAS on AWS, thus we converted it to Azure in [this notebook](/tutorials/notebooks/GWAS). Note that the CFDE page has a few other bioinformatics related tutorials like BLAST and Illumina read simulation. +) walks you through running a simple GWAS on AWS, thus we converted it to Azure in [this notebook](/notebooks/GWAS). Note that the CFDE page has a few other bioinformatics related tutorials like BLAST and Illumina read simulation. - This blog post [illustrates some of the costs associated](https://techcommunity.microsoft.com/t5/azure-high-performance-computing/azure-to-accelerate-genome-wide-analysis-study/ba-p/2644120) with running GWAS on Azure ## **NCBI BLAST+** @@ -74,7 +74,7 @@ NCBI BLAST (Basic Local Alignment Search Tool) is a widely used bioinformatics p ## **RNAseq** RNA-seq analysis is a high-throughput sequencing method that allows the measurement and characterization of gene expression levels and transcriptome dynamics. Workflows are typically run using workflow managers, and final results can often be visualized in notebooks. - You can run this [Nextflow on Azure tutorial](https://microsoft.github.io/Genomics-Community/mydoc_nextflow.html) for RNAseq a variety of ways on Azure. Following the instructions outlined above, you could use Virtual Machines, Azure Machine Learning, or Azure Batch. 
-- For a notebook version of a complete RNAseq pipeline from Fastq to Salmon quantification from the NIGMS Sandbox Program use this [notebook](/tutorials/notebooks/rnaseq-myco-tutorial-main), which we re-wrote to work on Azure. +- For a notebook version of a complete RNAseq pipeline from Fastq to Salmon quantification from the NIGMS Sandbox Program use this [notebook](/notebooks/rnaseq-myco-tutorial-main), which we re-wrote to work on Azure. ## **Single Cell RNAseq** Single-cell RNA sequencing (scRNA-seq) is a technique that enables the analysis of gene expression at the individual cell level, providing insights into cellular heterogeneity, identifying rare cell types, and revealing cellular dynamics and functional states within complex biological systems. From 7e0c1d120121c1488698f7f4a4fd20666abcf144 Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Tue, 5 Mar 2024 16:45:30 -0500 Subject: [PATCH 06/25] notebooks/ git push --- notebooks/GWAS/GWAS_coat_color.ipynb | 444 +++++++++++------- .../notebooks/Azure_Pubmed_chatbot.ipynb | 5 - 2 files changed, 265 insertions(+), 184 deletions(-) diff --git a/notebooks/GWAS/GWAS_coat_color.ipynb b/notebooks/GWAS/GWAS_coat_color.ipynb index 21b9b35..d2f4d1c 100644 --- a/notebooks/GWAS/GWAS_coat_color.ipynb +++ b/notebooks/GWAS/GWAS_coat_color.ipynb @@ -2,447 +2,536 @@ "cells": [ { "cell_type": "markdown", + "id": "7a244bb3", + "metadata": {}, "source": [ - "# GWAS in the cloud\n", - "We adapted the NIH CFDE tutorial from [here](https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud/background/) and fit it to a notebook. We have greatly simplified the instructions, so if you need or want more details, look at the full tutorial to find out more.\n", - "Most of this notebook is bash, but expects that you are using a Python kernel, until step 3, plotting, you will need to switch your kernel to R." 
- ], + "# Running Genome Wide Association Studies in the cloud" + ] + }, + { + "cell_type": "markdown", "metadata": {}, - "id": "7a244bb3" + "source": [ + "## Overview\n", + "Genome Wide Association Study analyses are conducted via the command line using mostly BASH commands, and plotting is often done using Python or R. Here, we adapted an [NIH CFDE tutorial](https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud/background/) and fit it to a notebook. We have greatly simplified the instructions, so if you need or want more details, look at the full tutorial to find out more.\n", + "\n", + "Most of this notebook is bash, but expects that you are using a Python kernel, until step 3, plotting, you will need to switch your kernel to R." + ] }, { "cell_type": "markdown", + "id": "8fbf6304", + "metadata": {}, "source": [ "## 1. Setup\n", "### Download the data\n", "use %%bash to denote a bash block. You can also use '!' to denote a single bash command within a Python notebook" - ], - "metadata": {}, - "id": "8fbf6304" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "8ec900bd", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "%%bash\n", "mkdir GWAS\n", "curl -LO https://de.cyverse.org/dl/d/E0A502CC-F806-4857-9C3A-BAEAA0CCC694/pruned_coatColor_maf_geno.vcf.gz\n", "curl -LO https://de.cyverse.org/dl/d/3B5C1853-C092-488C-8C2F-CE6E8526E96B/coatColor.pheno" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "8ec900bd" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "4d43ae73", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "%%bash\n", "mv *.gz GWAS\n", "mv *.pheno GWAS\n", "ls GWAS" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "4d43ae73" + ] }, { "attachments": {}, "cell_type": "markdown", + "id": "28aadbf8", + "metadata": {}, "source": [ "### Install dependencies\n", "Here we
install mamba, which is faster than conda. You could also skip this install and just use conda since that is preinstalled in the kernel." - ], - "metadata": {}, - "id": "28aadbf8" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "b3ba3eef", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "%%bash\n", "curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh\n", "bash Mambaforge-$(uname)-$(uname -m).sh -b -p $HOME/mambaforge" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "b3ba3eef" + ] }, { "cell_type": "code", - "source": [ - "#add to your path\n", - "import os\n", - "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"]+\"/mambaforge/bin\"" - ], - "outputs": [], "execution_count": null, + "id": "ae20d01c", "metadata": { "gather": { "logged": 1686580882939 + }, + "vscode": { + "languageId": "r" } }, - "id": "ae20d01c" + "outputs": [], + "source": [ + "#add to your path\n", + "import os\n", + "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"]+\"/mambaforge/bin\"" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "b219074a", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "! mamba install -y -c bioconda plink vcftools" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "b219074a" + ] }, { "cell_type": "markdown", + "id": "3de2fc4c", + "metadata": {}, "source": [ "## 2. 
Analyze" - ], - "metadata": {}, - "id": "3de2fc4c" + ] }, { "cell_type": "markdown", + "id": "013d960d", + "metadata": {}, "source": [ "### Make map and ped files from the vcf file to feed into plink" - ], - "metadata": {}, - "id": "013d960d" + ] }, { "cell_type": "code", - "source": [ - "cd GWAS" - ], - "outputs": [], "execution_count": null, + "id": "e91c7a01", "metadata": { "gather": { "logged": 1686579597925 + }, + "vscode": { + "languageId": "r" } }, - "id": "e91c7a01" + "outputs": [], + "source": [ + "cd GWAS" + ] }, { "cell_type": "code", - "source": [ - "ls GWAS" - ], - "outputs": [], "execution_count": null, + "id": "9b770f7f", "metadata": { "gather": { "logged": 1686579600325 + }, + "vscode": { + "languageId": "r" } }, - "id": "9b770f7f" + "outputs": [], + "source": [ + "ls GWAS" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "6570875d", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "! vcftools --gzvcf pruned_coatColor_maf_geno.vcf.gz --plink --out coatColor" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "6570875d" + ] }, { "cell_type": "markdown", + "id": "b9a38761", + "metadata": {}, "source": [ "### Create a list of minor alleles.\n", "For more info on these terms, look at step 2 at https://training.nih-cfde.org/en/latest/Bioinformatic-Analyses/GWAS-in-the-cloud/analyze/" - ], - "metadata": {}, - "id": "b9a38761" + ] }, { "cell_type": "code", - "source": [ - "#unzip vcf\n", - "! vcftools --gzvcf pruned_coatColor_maf_geno.vcf.gz --recode --out pruned_coatColor_maf_geno" - ], - "outputs": [], "execution_count": null, + "id": "6c868a67", "metadata": { "gather": { "logged": 1686581972147 + }, + "vscode": { + "languageId": "r" } }, - "id": "6c868a67" + "outputs": [], + "source": [ + "#unzip vcf\n", + "! 
vcftools --gzvcf pruned_coatColor_maf_geno.vcf.gz --recode --out pruned_coatColor_maf_geno" + ] }, { "cell_type": "code", - "source": [ - "#create list of minor alleles\n", - "! cat pruned_coatColor_maf_geno.recode.vcf | awk 'BEGIN{FS=\"\\t\";OFS=\"\\t\";}/#/{next;}{{if($3==\".\")$3=$1\":\"$2;}print $3,$5;}' > minor_alleles" - ], - "outputs": [], "execution_count": null, + "id": "8e11f991", "metadata": { "gather": { "logged": 1686581979545 + }, + "vscode": { + "languageId": "r" } }, - "id": "8e11f991" + "outputs": [], + "source": [ + "#create list of minor alleles\n", + "! cat pruned_coatColor_maf_geno.recode.vcf | awk 'BEGIN{FS=\"\\t\";OFS=\"\\t\";}/#/{next;}{{if($3==\".\")$3=$1\":\"$2;}print $3,$5;}' > minor_alleles" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "8cff47e3", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "! head minor_alleles" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "8cff47e3" + ] }, { "cell_type": "markdown", + "id": "56d901c7", + "metadata": {}, "source": [ "### Run quality controls" - ], - "metadata": {}, - "id": "56d901c7" + ] }, { "cell_type": "code", - "source": [ - "#calculate missingness per locus\n", - "! plink --file coatColor --make-pheno coatColor.pheno \"yellow\" --missing --out miss_stat --noweb --dog --reference-allele minor_alleles --allow-no-sex --adjust" - ], - "outputs": [], "execution_count": null, + "id": "dafa14a6", "metadata": { "gather": { "logged": 1686582023237 + }, + "vscode": { + "languageId": "r" } }, - "id": "dafa14a6" + "outputs": [], + "source": [ + "#calculate missingness per locus\n", + "! plink --file coatColor --make-pheno coatColor.pheno \"yellow\" --missing --out miss_stat --noweb --dog --reference-allele minor_alleles --allow-no-sex --adjust" + ] }, { "cell_type": "code", - "source": [ - "#take a look at lmiss, which is the per locus rates of missingness\n", - "! 
head miss_stat.lmiss" - ], - "outputs": [], "execution_count": null, + "id": "5cf5f51b", "metadata": { "gather": { "logged": 1686582030150 + }, + "vscode": { + "languageId": "r" } }, - "id": "5cf5f51b" + "outputs": [], + "source": [ + "#take a look at lmiss, which is the per locus rates of missingness\n", + "! head miss_stat.lmiss" + ] }, { "cell_type": "code", - "source": [ - "#peek at imiss which is the individual rates of missingness\n", - "! head miss_stat.imiss" - ], - "outputs": [], "execution_count": null, + "id": "915bb263", "metadata": { "gather": { "logged": 1686582034753 + }, + "vscode": { + "languageId": "r" } }, - "id": "915bb263" + "outputs": [], + "source": [ + "#peek at imiss which is the individual rates of missingness\n", + "! head miss_stat.imiss" + ] }, { "cell_type": "markdown", + "id": "4c11ca71", + "metadata": {}, "source": [ "### Convert to plink binary format" - ], - "metadata": {}, - "id": "4c11ca71" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "3b8f2d7f", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "! plink --file coatColor --allow-no-sex --dog --make-bed --noweb --out coatColor.binary" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "3b8f2d7f" + ] }, { "cell_type": "markdown", + "id": "e36f6cd7", + "metadata": {}, "source": [ "### Run a simple association step (the GWAS part!)" - ], - "metadata": {}, - "id": "e36f6cd7" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "f926ef9b", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "! 
plink --bfile coatColor.binary --make-pheno coatColor.pheno \"yellow\" --assoc --reference-allele minor_alleles --allow-no-sex --adjust --dog --noweb --out coatColor" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "f926ef9b" + ] }, { "cell_type": "markdown", + "id": "b397d484", + "metadata": {}, "source": [ "### Identify statistical cutoffs\n", "This code finds the equivalent of 0.05 and 0.01 p value in the negative-log-transformed p values file. We will use these cutoffs to draw horizontal lines in the Manhattan plot for visualization of haplotypes that cross the 0.05 and 0.01 statistical threshold (i.e. have a statistically significant association with yellow coat color)" - ], - "metadata": {}, - "id": "b397d484" + ] }, { "cell_type": "code", + "execution_count": null, + "id": "b94e1e2a", + "metadata": { + "vscode": { + "languageId": "r" + } + }, + "outputs": [], "source": [ "%%bash\n", "unad_cutoff_sug=$(tail -n+2 coatColor.assoc.adjusted | awk '$10>=0.05' | head -n1 | awk '{print $3}')\n", "unad_cutoff_conf=$(tail -n+2 coatColor.assoc.adjusted | awk '$10>=0.01' | head -n1 | awk '{print $3}')" - ], - "outputs": [], - "execution_count": null, - "metadata": {}, - "id": "b94e1e2a" + ] }, { "cell_type": "markdown", + "id": "1f52e97c", + "metadata": {}, "source": [ "## 3. Plotting\n", "In this tutorial, plotting is done in R. Azure gets a bit funny about running these R commands, so we recommend just runnning the rest of the commands in the Terminal. Run `R` before running the commands. Otherwise you can just download the inputs and run locally in R studio." 
- ], - "metadata": {}, - "id": "1f52e97c" + ] }, { "cell_type": "markdown", + "id": "effb5acd", + "metadata": {}, "source": [ "### Install qqman" - ], - "metadata": {}, - "id": "effb5acd" + ] }, { "cell_type": "code", - "source": [ - "install.packages('qqman', contriburl=contrib.url('http://cran.r-project.org/'))" - ], - "outputs": [], "execution_count": null, + "id": "60feed89", "metadata": { "gather": { "logged": 1686582094642 + }, + "vscode": { + "languageId": "r" } }, - "id": "60feed89" + "outputs": [], + "source": [ + "install.packages('qqman', contriburl=contrib.url('http://cran.r-project.org/'))" + ] }, { "cell_type": "markdown", + "id": "d3f1fcd2", + "metadata": {}, "source": [ "### Run the plotting function" - ], - "metadata": {}, - "id": "d3f1fcd2" + ] }, { "cell_type": "code", - "source": [ - "#make sure you are still CD in GWAS, when you change kernel it may reset to home\n", - "setwd('GWAS')" - ], - "outputs": [], "execution_count": null, + "id": "a7e8cd2b", "metadata": { "gather": { "logged": 1686584355516 + }, + "vscode": { + "languageId": "r" } }, - "id": "a7e8cd2b" + "outputs": [], + "source": [ + "#make sure you are still CD in GWAS, when you change kernel it may reset to home\n", + "setwd('GWAS')" + ] }, { "cell_type": "code", - "source": [ - "require(qqman)" - ], - "outputs": [], "execution_count": null, + "id": "7946a3a7", "metadata": { "gather": { "logged": 1686584356532 + }, + "vscode": { + "languageId": "r" } }, - "id": "7946a3a7" + "outputs": [], + "source": [ + "require(qqman)" + ] }, { "cell_type": "code", - "source": [ - "data=read.table(\"coatColor.assoc\", header=TRUE)" - ], - "outputs": [], "execution_count": null, + "id": "0d28ef2c", "metadata": { "gather": { "logged": 1686584364339 + }, + "vscode": { + "languageId": "r" } }, - "id": "0d28ef2c" + "outputs": [], + "source": [ + "data=read.table(\"coatColor.assoc\", header=TRUE)" + ] }, { "cell_type": "code", - "source": [ - "data=data[!is.na(data$P),]" - ], - "outputs": [], 
"execution_count": null, + "id": "8e5207be", "metadata": { "gather": { "logged": 1686584368241 + }, + "vscode": { + "languageId": "r" } }, - "id": "8e5207be" + "outputs": [], + "source": [ + "data=data[!is.na(data$P),]" + ] }, { "cell_type": "code", - "source": [ - "manhattan(data, p = \"P\", col = c(\"blue4\", \"orange3\"),\n", - " suggestiveline = 12,\n", - " genomewideline = 15,\n", - " chrlabs = c(1:38, \"X\"), annotateTop=TRUE, cex = 1.2)" - ], - "outputs": [], "execution_count": null, + "id": "6330b1e0", "metadata": { "gather": { "logged": 1686584371278 + }, + "vscode": { + "languageId": "r" } }, - "id": "6330b1e0" + "outputs": [], + "source": [ + "manhattan(data, p = \"P\", col = c(\"blue4\", \"orange3\"),\n", + " suggestiveline = 12,\n", + " genomewideline = 15,\n", + " chrlabs = c(1:38, \"X\"), annotateTop=TRUE, cex = 1.2)" + ] }, { "cell_type": "markdown", + "id": "26787d84", + "metadata": {}, "source": [ "In our graph, haplotypes in four parts of the genome (chromosome 2, 5, 28 and X) are found to be associated with an increased occurrence of the yellow coat color phenotype.\n", "\n", "The top associated mutation is a nonsense SNP in the gene MC1R known to control pigment production. The MC1R allele encoding yellow coat color contains a single base change (from C to T) at the 916th nucleotide." 
- ], - "metadata": {}, - "id": "26787d84" + ] } ], "metadata": { + "kernel_info": { + "name": "ir" + }, "kernelspec": { - "name": "ir", + "display_name": "R", "language": "R", - "display_name": "R" + "name": "ir" }, "language_info": { - "name": "R", "codemirror_mode": "r", - "pygments_lexer": "r", - "mimetype": "text/x-r-source", "file_extension": ".r", + "mimetype": "text/x-r-source", + "name": "R", + "pygments_lexer": "r", "version": "4.2.2" }, "microsoft": { @@ -450,13 +539,10 @@ "ms_spell_check_language": "en" } }, - "kernel_info": { - "name": "ir" - }, "nteract": { "version": "nteract-front-end@1.0.0" } }, "nbformat": 4, "nbformat_minor": 5 -} \ No newline at end of file +} diff --git a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index be462cd..247b973 100644 --- a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -8,11 +8,6 @@ "# Creating a PubMed Chatbot using Azure" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - }, { "cell_type": "markdown", "metadata": {}, From 9a214941fea7b01d2023295759461666f46e8d2b Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Tue, 5 Mar 2024 17:03:02 -0500 Subject: [PATCH 07/25] fixed some capitalization issues in notebooks --- .../notebooks/AzureAIStudio_index_structured_notebook.ipynb | 4 ++-- .../AzureAIStudio_index_structured_with_console.ipynb | 4 ++-- notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb | 2 +- notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb | 4 ++-- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb index 8ba1e89..cd56dd8 100644 --- a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb +++ b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb @@ -40,7 
+40,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Get Started" + "## Get started" ] }, { @@ -783,7 +783,7 @@ "id": "0459e0ae-5183-4b6a-9eca-41c97b0b8a8c", "metadata": {}, "source": [ - "## Clean Up" + "## Clean up" ] }, { diff --git a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb index a13ae1f..29ad25b 100644 --- a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb +++ b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb @@ -47,7 +47,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Get Started" + "## Get started" ] }, { @@ -357,7 +357,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Clean Up" + "## Clean up" ] }, { diff --git a/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb index 04d8ba7..3cce50b 100644 --- a/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb +++ b/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb @@ -50,7 +50,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Get Started" + "## Get started" ] }, { diff --git a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index 247b973..70363cd 100644 --- a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -41,7 +41,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Get Started" + "## Get started" ] }, { @@ -1565,7 +1565,7 @@ "id": "a178c1c6-368a-48c5-8beb-278443b685a2", "metadata": {}, "source": [ - "## Clean Up" + "## Clean up" ] }, { From 4b60261739573de05a8c814528102f6fc8f08f82 Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Tue, 5 Mar 2024 17:19:11 -0500 Subject: [PATCH 08/25] reformatted some notebooks, gwas and pangolin --- notebooks/GWAS/GWAS_coat_color.ipynb | 57 +- 
notebooks/pangolin/pangolin.yaml | 13 - notebooks/pangolin/pangolin_pipeline.ipynb | 1112 ++------------------ 3 files changed, 116 insertions(+), 1066 deletions(-) delete mode 100644 notebooks/pangolin/pangolin.yaml diff --git a/notebooks/GWAS/GWAS_coat_color.ipynb b/notebooks/GWAS/GWAS_coat_color.ipynb index d2f4d1c..0bcfbc1 100644 --- a/notebooks/GWAS/GWAS_coat_color.ipynb +++ b/notebooks/GWAS/GWAS_coat_color.ipynb @@ -18,12 +18,34 @@ "Most of this notebook is bash, but expects that you are using a Python kernel, until step 3, plotting, you will need to switch your kernel to R." ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have provisioned a compute environment in Azure ML Studio" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Learning objectives\n", + "+ Learn how to run GWAS analysis and visualize results in Azure AI Studio" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get started" + ] + }, { "cell_type": "markdown", "id": "8fbf6304", "metadata": {}, "source": [ - "## 1. Setup\n", "### Download the data\n", "use %%bash to denote a bash block. You can also use '!' to denote a single bash command within a Python notebook" ] @@ -68,7 +90,7 @@ "id": "28aadbf8", "metadata": {}, "source": [ - "### Install dependencies\n", + "### Install packages\n", "Here we install mamba, which is faster than conda. You could also skip this install and just use conda since that is preinstalled in the kernel." ] }, @@ -121,14 +143,6 @@ "! mamba install -y -c bioconda plink vcftools" ] }, - { - "cell_type": "markdown", - "id": "3de2fc4c", - "metadata": {}, - "source": [ - "## 2. Analyze" - ] - }, { "cell_type": "markdown", "id": "013d960d", @@ -380,7 +394,7 @@ "id": "1f52e97c", "metadata": {}, "source": [ - "## 3. Plotting\n", + "### Plotting\n", "In this tutorial, plotting is done in R. 
Azure gets a bit funny about running these R commands, so we recommend just runnning the rest of the commands in the Terminal. Run `R` before running the commands. Otherwise you can just download the inputs and run locally in R studio." ] }, @@ -515,6 +529,27 @@ "\n", "The top associated mutation is a nonsense SNP in the gene MC1R known to control pigment production. The MC1R allele encoding yellow coat color contains a single base change (from C to T) at the 916th nucleotide." ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusions\n", + "You learned here how to run and visualize GWAS results using a notebook in Azure ML Studio." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Clean Up\n", + "Make sure you stop your compute instance and if desired, delete the resource group associated with this tutorial." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] } ], "metadata": { diff --git a/notebooks/pangolin/pangolin.yaml b/notebooks/pangolin/pangolin.yaml deleted file mode 100644 index 4a235b0..0000000 --- a/notebooks/pangolin/pangolin.yaml +++ /dev/null @@ -1,13 +0,0 @@ -name: pangolin -channels: - - bioconda - - conda-forge - - defaults - - eaton-lab - -dependencies: - - sra-tools - - ipyrad - - toytree - - pangolin - - iqtree diff --git a/notebooks/pangolin/pangolin_pipeline.ipynb b/notebooks/pangolin/pangolin_pipeline.ipynb index 862c45a..3603c9a 100644 --- a/notebooks/pangolin/pangolin_pipeline.ipynb +++ b/notebooks/pangolin/pangolin_pipeline.ipynb @@ -10,810 +10,106 @@ }, { "cell_type": "markdown", - "id": "56a29212", "metadata": {}, "source": [ - "We are going to run a standard covid bioinformatics pipeline using the Pangolin workflow. https://cov-lineages.org/resources/pangolin/usage.html" + "## Overview \n", + "SARS-CoV-2 sequence is usually analyzed using a bioinformatic pipeline called Pangolin. 
Here we will download some genomic data and run Pangolin following [standard instructions](https://cov-lineages.org/resources/pangolin/usage.html). " ] }, { "cell_type": "markdown", - "id": "03541941", - "metadata": {}, - "source": [ - "### Required software" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "f994b990", "metadata": {}, - "outputs": [], "source": [ - "#change this depending on how many threads are available in your notebook\n", - "CPU=4" + "## Prerequisites\n", + "We assume you have access to Azure AI Studio and have already deployed an LLM " ] }, { - "cell_type": "code", - "execution_count": 2, - "id": "a19b662e", + "cell_type": "markdown", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "PREFIX=/home/ec2-user/mambaforge\n", - "Unpacking payload ...\n", - "Extracting \"libmambapy-0.19.0-py39h8bfa403_0.tar.bz2\"\n", - "Extracting \"zstd-1.5.0-ha95c52a_0.tar.bz2\"\n", - "Extracting \"readline-8.1-h46c0cb4_0.tar.bz2\"\n", - "Extracting \"yaml-cpp-0.6.3-he1b5a44_4.tar.bz2\"\n", - "Extracting \"libgcc-ng-11.2.0-h1d223b6_11.tar.bz2\"\n", - "Extracting \"cffi-1.15.0-py39h4bc2ebd_0.tar.bz2\"\n", - "Extracting \"wheel-0.37.0-pyhd8ed1ab_1.tar.bz2\"\n", - "Extracting \"colorama-0.4.4-pyh9f0ad1d_0.tar.bz2\"\n", - "Extracting \"_openmp_mutex-4.5-1_gnu.tar.bz2\"\n", - "Extracting \"tzdata-2021e-he74cb21_0.tar.bz2\"\n", - "Extracting \"reproc-cpp-14.2.3-h9c3ff4c_0.tar.bz2\"\n", - "Extracting \"lz4-c-1.9.3-h9c3ff4c_1.tar.bz2\"\n", - "Extracting \"libarchive-3.5.2-hccf745f_1.tar.bz2\"\n", - "Extracting \"libedit-3.1.20191231-he28a2e2_2.tar.bz2\"\n", - "Extracting \"icu-69.1-h9c3ff4c_0.tar.bz2\"\n", - "Extracting \"ld_impl_linux-64-2.36.1-hea4e1c9_2.tar.bz2\"\n", - "Extracting \"libzlib-1.2.11-h36c2ea0_1013.tar.bz2\"\n", - "Extracting \"tqdm-4.62.3-pyhd8ed1ab_0.tar.bz2\"\n", - "Extracting \"ncurses-6.2-h58526e2_4.tar.bz2\"\n", - "Extracting \"openssl-1.1.1l-h7f98852_0.tar.bz2\"\n", - "Extracting 
\"cryptography-36.0.0-py39h95dcef6_0.tar.bz2\"\n", - "Extracting \"zlib-1.2.11-h36c2ea0_1013.tar.bz2\"\n", - "Extracting \"lzo-2.10-h516909a_1000.tar.bz2\"\n", - "Extracting \"c-ares-1.18.1-h7f98852_0.tar.bz2\"\n", - "Extracting \"pyopenssl-21.0.0-pyhd8ed1ab_0.tar.bz2\"\n", - "Extracting \"conda-package-handling-1.7.3-py39h3811e60_1.tar.bz2\"\n", - "Extracting \"idna-3.1-pyhd3deb0d_0.tar.bz2\"\n", - "Extracting \"libmamba-0.19.0-h3985d26_0.tar.bz2\"\n", - "Extracting \"reproc-14.2.3-h7f98852_0.tar.bz2\"\n", - "Extracting \"pip-21.3.1-pyhd8ed1ab_0.tar.bz2\"\n", - "Extracting \"tk-8.6.11-h27826a3_1.tar.bz2\"\n", - "Extracting \"conda-4.11.0-py39hf3d152e_0.tar.bz2\"\n", - "Extracting \"requests-2.26.0-pyhd8ed1ab_1.tar.bz2\"\n", - "Extracting \"_libgcc_mutex-0.1-conda_forge.tar.bz2\"\n", - "Extracting \"brotlipy-0.7.0-py39h3811e60_1003.tar.bz2\"\n", - "Extracting \"python-3.9.7-hb7a2778_3_cpython.tar.bz2\"\n", - "Extracting \"yaml-0.2.5-h516909a_0.tar.bz2\"\n", - "Extracting \"bzip2-1.0.8-h7f98852_4.tar.bz2\"\n", - "Extracting \"libffi-3.4.2-h7f98852_5.tar.bz2\"\n", - "Extracting \"krb5-1.19.2-hcc1bbae_3.tar.bz2\"\n", - "Extracting \"charset-normalizer-2.0.9-pyhd8ed1ab_0.tar.bz2\"\n", - "Extracting \"pysocks-1.7.1-py39hf3d152e_4.tar.bz2\"\n", - "Extracting \"libgomp-11.2.0-h1d223b6_11.tar.bz2\"\n", - "Extracting \"pybind11-abi-4-hd8ed1ab_3.tar.bz2\"\n", - "Extracting \"python_abi-3.9-2_cp39.tar.bz2\"\n", - "Extracting \"libiconv-1.16-h516909a_0.tar.bz2\"\n", - "Extracting \"libcurl-7.80.0-h2574ce0_0.tar.bz2\"\n", - "Extracting \"libxml2-2.9.12-h885dcf4_1.tar.bz2\"\n", - "Extracting \"pycosat-0.6.3-py39h3811e60_1009.tar.bz2\"\n", - "Extracting \"certifi-2021.10.8-py39hf3d152e_1.tar.bz2\"\n", - "Extracting \"libssh2-1.10.0-ha56f1ee_2.tar.bz2\"\n", - "Extracting \"libnghttp2-1.43.0-h812cca2_1.tar.bz2\"\n", - "Extracting \"mamba-0.19.0-py39hfa8f2c8_0.tar.bz2\"\n", - "Extracting \"ruamel_yaml-0.15.80-py39h3811e60_1006.tar.bz2\"\n", - "Extracting 
\"xz-5.2.5-h516909a_1.tar.bz2\"\n", - "Extracting \"setuptools-59.4.0-py39hf3d152e_0.tar.bz2\"\n", - "Extracting \"six-1.16.0-pyh6c4a22f_0.tar.bz2\"\n", - "Extracting \"urllib3-1.26.7-pyhd8ed1ab_0.tar.bz2\"\n", - "Extracting \"libstdcxx-ng-11.2.0-he4da1e4_11.tar.bz2\"\n", - "Extracting \"libsolv-0.7.19-h780b84a_5.tar.bz2\"\n", - "Extracting \"pycparser-2.21-pyhd8ed1ab_0.tar.bz2\"\n", - "Extracting \"ca-certificates-2021.10.8-ha878542_0.tar.bz2\"\n", - "Extracting \"sqlite-3.37.0-h9cd32fc_0.tar.bz2\"\n", - "Extracting \"libev-4.33-h516909a_1.tar.bz2\"\n", - "\n", - " __\n", - " __ ______ ___ ____ _____ ___ / /_ ____ _\n", - " / / / / __ `__ \\/ __ `/ __ `__ \\/ __ \\/ __ `/\n", - " / /_/ / / / / / / /_/ / / / / / / /_/ / /_/ /\n", - " / .___/_/ /_/ /_/\\__,_/_/ /_/ /_/_.___/\\__,_/\n", - " /_/\n", - "\n", - "\n", - "Transaction\n", - "\n", - " Prefix: /home/ec2-user/mambaforge\n", - "\n", - " Updating specs:\n", - "\n", - " - python==3.9.7=hb7a2778_3_cpython\n", - " - _libgcc_mutex==0.1=conda_forge\n", - " - ca-certificates==2021.10.8=ha878542_0\n", - " - ld_impl_linux-64==2.36.1=hea4e1c9_2\n", - " - libstdcxx-ng==11.2.0=he4da1e4_11\n", - " - pybind11-abi==4=hd8ed1ab_3\n", - " - tzdata==2021e=he74cb21_0\n", - " - libgomp==11.2.0=h1d223b6_11\n", - " - _openmp_mutex==4.5=1_gnu\n", - " - libgcc-ng==11.2.0=h1d223b6_11\n", - " - bzip2==1.0.8=h7f98852_4\n", - " - c-ares==1.18.1=h7f98852_0\n", - " - icu==69.1=h9c3ff4c_0\n", - " - libev==4.33=h516909a_1\n", - " - libffi==3.4.2=h7f98852_5\n", - " - libiconv==1.16=h516909a_0\n", - " - libzlib==1.2.11=h36c2ea0_1013\n", - " - lz4-c==1.9.3=h9c3ff4c_1\n", - " - lzo==2.10=h516909a_1000\n", - " - ncurses==6.2=h58526e2_4\n", - " - openssl==1.1.1l=h7f98852_0\n", - " - reproc==14.2.3=h7f98852_0\n", - " - xz==5.2.5=h516909a_1\n", - " - yaml==0.2.5=h516909a_0\n", - " - yaml-cpp==0.6.3=he1b5a44_4\n", - " - libedit==3.1.20191231=he28a2e2_2\n", - " - readline==8.1=h46c0cb4_0\n", - " - reproc-cpp==14.2.3=h9c3ff4c_0\n", - " - 
zlib==1.2.11=h36c2ea0_1013\n", - " - libnghttp2==1.43.0=h812cca2_1\n", - " - libsolv==0.7.19=h780b84a_5\n", - " - libssh2==1.10.0=ha56f1ee_2\n", - " - libxml2==2.9.12=h885dcf4_1\n", - " - sqlite==3.37.0=h9cd32fc_0\n", - " - tk==8.6.11=h27826a3_1\n", - " - zstd==1.5.0=ha95c52a_0\n", - " - krb5==1.19.2=hcc1bbae_3\n", - " - libarchive==3.5.2=hccf745f_1\n", - " - charset-normalizer==2.0.9=pyhd8ed1ab_0\n", - " - colorama==0.4.4=pyh9f0ad1d_0\n", - " - idna==3.1=pyhd3deb0d_0\n", - " - libcurl==7.80.0=h2574ce0_0\n", - " - pycparser==2.21=pyhd8ed1ab_0\n", - " - python_abi==3.9=2_cp39\n", - " - six==1.16.0=pyh6c4a22f_0\n", - " - wheel==0.37.0=pyhd8ed1ab_1\n", - " - certifi==2021.10.8=py39hf3d152e_1\n", - " - cffi==1.15.0=py39h4bc2ebd_0\n", - " - libmamba==0.19.0=h3985d26_0\n", - " - pycosat==0.6.3=py39h3811e60_1009\n", - " - pysocks==1.7.1=py39hf3d152e_4\n", - " - ruamel_yaml==0.15.80=py39h3811e60_1006\n", - " - setuptools==59.4.0=py39hf3d152e_0\n", - " - tqdm==4.62.3=pyhd8ed1ab_0\n", - " - brotlipy==0.7.0=py39h3811e60_1003\n", - " - conda-package-handling==1.7.3=py39h3811e60_1\n", - " - cryptography==36.0.0=py39h95dcef6_0\n", - " - libmambapy==0.19.0=py39h8bfa403_0\n", - " - pip==21.3.1=pyhd8ed1ab_0\n", - " - pyopenssl==21.0.0=pyhd8ed1ab_0\n", - " - urllib3==1.26.7=pyhd8ed1ab_0\n", - " - requests==2.26.0=pyhd8ed1ab_1\n", - " - conda==4.11.0=py39hf3d152e_0\n", - " - mamba==0.19.0=py39hfa8f2c8_0\n", - "\n", - "\n", - " Package Version Build Channel Size\n", - "───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n", - " Install:\n", - "───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n", - "\n", - " + _libgcc_mutex 0.1 conda_forge conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2 Cached\n", - " + _openmp_mutex 4.5 1_gnu conda-forge/linux-64/_openmp_mutex-4.5-1_gnu.tar.bz2 
Cached\n", - " + brotlipy 0.7.0 py39h3811e60_1003 conda-forge/linux-64/brotlipy-0.7.0-py39h3811e60_1003.tar.bz2 Cached\n", - " + bzip2 1.0.8 h7f98852_4 conda-forge/linux-64/bzip2-1.0.8-h7f98852_4.tar.bz2 Cached\n", - " + c-ares 1.18.1 h7f98852_0 conda-forge/linux-64/c-ares-1.18.1-h7f98852_0.tar.bz2 Cached\n", - " + ca-certificates 2021.10.8 ha878542_0 conda-forge/linux-64/ca-certificates-2021.10.8-ha878542_0.tar.bz2 Cached\n", - " + certifi 2021.10.8 py39hf3d152e_1 conda-forge/linux-64/certifi-2021.10.8-py39hf3d152e_1.tar.bz2 Cached\n", - " + cffi 1.15.0 py39h4bc2ebd_0 conda-forge/linux-64/cffi-1.15.0-py39h4bc2ebd_0.tar.bz2 Cached\n", - " + charset-normalizer 2.0.9 pyhd8ed1ab_0 conda-forge/noarch/charset-normalizer-2.0.9-pyhd8ed1ab_0.tar.bz2 Cached\n", - " + colorama 0.4.4 pyh9f0ad1d_0 conda-forge/noarch/colorama-0.4.4-pyh9f0ad1d_0.tar.bz2 Cached\n", - " + conda 4.11.0 py39hf3d152e_0 conda-forge/linux-64/conda-4.11.0-py39hf3d152e_0.tar.bz2 Cached\n", - " + conda-package-handling 1.7.3 py39h3811e60_1 conda-forge/linux-64/conda-package-handling-1.7.3-py39h3811e60_1.tar.bz2 Cached\n", - " + cryptography 36.0.0 py39h95dcef6_0 conda-forge/linux-64/cryptography-36.0.0-py39h95dcef6_0.tar.bz2 Cached\n", - " + icu 69.1 h9c3ff4c_0 conda-forge/linux-64/icu-69.1-h9c3ff4c_0.tar.bz2 Cached\n", - " + idna 3.1 pyhd3deb0d_0 conda-forge/noarch/idna-3.1-pyhd3deb0d_0.tar.bz2 Cached\n", - " + krb5 1.19.2 hcc1bbae_3 conda-forge/linux-64/krb5-1.19.2-hcc1bbae_3.tar.bz2 Cached\n", - " + ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge/linux-64/ld_impl_linux-64-2.36.1-hea4e1c9_2.tar.bz2 Cached\n", - " + libarchive 3.5.2 hccf745f_1 conda-forge/linux-64/libarchive-3.5.2-hccf745f_1.tar.bz2 Cached\n", - " + libcurl 7.80.0 h2574ce0_0 conda-forge/linux-64/libcurl-7.80.0-h2574ce0_0.tar.bz2 Cached\n", - " + libedit 3.1.20191231 he28a2e2_2 conda-forge/linux-64/libedit-3.1.20191231-he28a2e2_2.tar.bz2 Cached\n", - " + libev 4.33 h516909a_1 conda-forge/linux-64/libev-4.33-h516909a_1.tar.bz2 Cached\n", - 
" + libffi 3.4.2 h7f98852_5 conda-forge/linux-64/libffi-3.4.2-h7f98852_5.tar.bz2 Cached\n", - " + libgcc-ng 11.2.0 h1d223b6_11 conda-forge/linux-64/libgcc-ng-11.2.0-h1d223b6_11.tar.bz2 Cached\n", - " + libgomp 11.2.0 h1d223b6_11 conda-forge/linux-64/libgomp-11.2.0-h1d223b6_11.tar.bz2 Cached\n", - " + libiconv 1.16 h516909a_0 conda-forge/linux-64/libiconv-1.16-h516909a_0.tar.bz2 Cached\n", - " + libmamba 0.19.0 h3985d26_0 conda-forge/linux-64/libmamba-0.19.0-h3985d26_0.tar.bz2 Cached\n", - " + libmambapy 0.19.0 py39h8bfa403_0 conda-forge/linux-64/libmambapy-0.19.0-py39h8bfa403_0.tar.bz2 Cached\n", - " + libnghttp2 1.43.0 h812cca2_1 conda-forge/linux-64/libnghttp2-1.43.0-h812cca2_1.tar.bz2 Cached\n", - " + libsolv 0.7.19 h780b84a_5 conda-forge/linux-64/libsolv-0.7.19-h780b84a_5.tar.bz2 Cached\n", - " + libssh2 1.10.0 ha56f1ee_2 conda-forge/linux-64/libssh2-1.10.0-ha56f1ee_2.tar.bz2 Cached\n", - " + libstdcxx-ng 11.2.0 he4da1e4_11 conda-forge/linux-64/libstdcxx-ng-11.2.0-he4da1e4_11.tar.bz2 Cached\n", - " + libxml2 2.9.12 h885dcf4_1 conda-forge/linux-64/libxml2-2.9.12-h885dcf4_1.tar.bz2 Cached\n", - " + libzlib 1.2.11 h36c2ea0_1013 conda-forge/linux-64/libzlib-1.2.11-h36c2ea0_1013.tar.bz2 Cached\n", - " + lz4-c 1.9.3 h9c3ff4c_1 conda-forge/linux-64/lz4-c-1.9.3-h9c3ff4c_1.tar.bz2 Cached\n", - " + lzo 2.10 h516909a_1000 conda-forge/linux-64/lzo-2.10-h516909a_1000.tar.bz2 Cached\n", - " + mamba 0.19.0 py39hfa8f2c8_0 conda-forge/linux-64/mamba-0.19.0-py39hfa8f2c8_0.tar.bz2 Cached\n", - " + ncurses 6.2 h58526e2_4 conda-forge/linux-64/ncurses-6.2-h58526e2_4.tar.bz2 Cached\n", - " + openssl 1.1.1l h7f98852_0 conda-forge/linux-64/openssl-1.1.1l-h7f98852_0.tar.bz2 Cached\n", - " + pip 21.3.1 pyhd8ed1ab_0 conda-forge/noarch/pip-21.3.1-pyhd8ed1ab_0.tar.bz2 Cached\n", - " + pybind11-abi 4 hd8ed1ab_3 conda-forge/noarch/pybind11-abi-4-hd8ed1ab_3.tar.bz2 Cached\n", - " + pycosat 0.6.3 py39h3811e60_1009 conda-forge/linux-64/pycosat-0.6.3-py39h3811e60_1009.tar.bz2 Cached\n", - " + 
pycparser 2.21 pyhd8ed1ab_0 conda-forge/noarch/pycparser-2.21-pyhd8ed1ab_0.tar.bz2 Cached\n", - " + pyopenssl 21.0.0 pyhd8ed1ab_0 conda-forge/noarch/pyopenssl-21.0.0-pyhd8ed1ab_0.tar.bz2 Cached\n", - " + pysocks 1.7.1 py39hf3d152e_4 conda-forge/linux-64/pysocks-1.7.1-py39hf3d152e_4.tar.bz2 Cached\n", - " + python 3.9.7 hb7a2778_3_cpython conda-forge/linux-64/python-3.9.7-hb7a2778_3_cpython.tar.bz2 Cached\n", - " + python_abi 3.9 2_cp39 conda-forge/linux-64/python_abi-3.9-2_cp39.tar.bz2 Cached\n", - " + readline 8.1 h46c0cb4_0 conda-forge/linux-64/readline-8.1-h46c0cb4_0.tar.bz2 Cached\n", - " + reproc 14.2.3 h7f98852_0 conda-forge/linux-64/reproc-14.2.3-h7f98852_0.tar.bz2 Cached\n", - " + reproc-cpp 14.2.3 h9c3ff4c_0 conda-forge/linux-64/reproc-cpp-14.2.3-h9c3ff4c_0.tar.bz2 Cached\n", - " + requests 2.26.0 pyhd8ed1ab_1 conda-forge/noarch/requests-2.26.0-pyhd8ed1ab_1.tar.bz2 Cached\n", - " + ruamel_yaml 0.15.80 py39h3811e60_1006 conda-forge/linux-64/ruamel_yaml-0.15.80-py39h3811e60_1006.tar.bz2 Cached\n", - " + setuptools 59.4.0 py39hf3d152e_0 conda-forge/linux-64/setuptools-59.4.0-py39hf3d152e_0.tar.bz2 Cached\n", - " + six 1.16.0 pyh6c4a22f_0 conda-forge/noarch/six-1.16.0-pyh6c4a22f_0.tar.bz2 Cached\n", - " + sqlite 3.37.0 h9cd32fc_0 conda-forge/linux-64/sqlite-3.37.0-h9cd32fc_0.tar.bz2 Cached\n", - " + tk 8.6.11 h27826a3_1 conda-forge/linux-64/tk-8.6.11-h27826a3_1.tar.bz2 Cached\n", - " + tqdm 4.62.3 pyhd8ed1ab_0 conda-forge/noarch/tqdm-4.62.3-pyhd8ed1ab_0.tar.bz2 Cached\n", - " + tzdata 2021e he74cb21_0 conda-forge/noarch/tzdata-2021e-he74cb21_0.tar.bz2 Cached\n", - " + urllib3 1.26.7 pyhd8ed1ab_0 conda-forge/noarch/urllib3-1.26.7-pyhd8ed1ab_0.tar.bz2 Cached\n", - " + wheel 0.37.0 pyhd8ed1ab_1 conda-forge/noarch/wheel-0.37.0-pyhd8ed1ab_1.tar.bz2 Cached\n", - " + xz 5.2.5 h516909a_1 conda-forge/linux-64/xz-5.2.5-h516909a_1.tar.bz2 Cached\n", - " + yaml 0.2.5 h516909a_0 conda-forge/linux-64/yaml-0.2.5-h516909a_0.tar.bz2 Cached\n", - " + yaml-cpp 0.6.3 he1b5a44_4 
conda-forge/linux-64/yaml-cpp-0.6.3-he1b5a44_4.tar.bz2 Cached\n", - " + zlib 1.2.11 h36c2ea0_1013 conda-forge/linux-64/zlib-1.2.11-h36c2ea0_1013.tar.bz2 Cached\n", - " + zstd 1.5.0 ha95c52a_0 conda-forge/linux-64/zstd-1.5.0-ha95c52a_0.tar.bz2 Cached\n", - "\n", - " Summary:\n", - "\n", - " Install: 64 packages\n", - "\n", - " Total download: 0 B\n", - "\n", - "───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n", - "\n", - "\n", - "\n", - "Transaction starting\n", - "Linking ca-certificates-2021.10.8-ha878542_0\n", - "Linking ld_impl_linux-64-2.36.1-hea4e1c9_2\n", - "Linking libstdcxx-ng-11.2.0-he4da1e4_11\n", - "Linking pybind11-abi-4-hd8ed1ab_3\n", - "Linking _libgcc_mutex-0.1-conda_forge\n", - "Linking tzdata-2021e-he74cb21_0\n", - "Linking libgomp-11.2.0-h1d223b6_11\n", - "Linking _openmp_mutex-4.5-1_gnu\n", - "Linking libgcc-ng-11.2.0-h1d223b6_11\n", - "Linking ncurses-6.2-h58526e2_4\n", - "Linking libzlib-1.2.11-h36c2ea0_1013\n", - "Linking libiconv-1.16-h516909a_0\n", - "Linking libev-4.33-h516909a_1\n", - "Linking yaml-cpp-0.6.3-he1b5a44_4\n", - "Linking yaml-0.2.5-h516909a_0\n", - "Linking xz-5.2.5-h516909a_1\n", - "Linking reproc-14.2.3-h7f98852_0\n", - "Linking openssl-1.1.1l-h7f98852_0\n", - "Linking lzo-2.10-h516909a_1000\n", - "Linking lz4-c-1.9.3-h9c3ff4c_1\n", - "Linking libffi-3.4.2-h7f98852_5\n", - "Linking icu-69.1-h9c3ff4c_0\n", - "Linking c-ares-1.18.1-h7f98852_0\n", - "Linking bzip2-1.0.8-h7f98852_4\n", - "Linking readline-8.1-h46c0cb4_0\n", - "Linking libedit-3.1.20191231-he28a2e2_2\n", - "Linking zlib-1.2.11-h36c2ea0_1013\n", - "Linking reproc-cpp-14.2.3-h9c3ff4c_0\n", - "Linking tk-8.6.11-h27826a3_1\n", - "Linking zstd-1.5.0-ha95c52a_0\n", - "Linking sqlite-3.37.0-h9cd32fc_0\n", - "Linking libxml2-2.9.12-h885dcf4_1\n", - "Linking libssh2-1.10.0-ha56f1ee_2\n", - "Linking libsolv-0.7.19-h780b84a_5\n", - "Linking libnghttp2-1.43.0-h812cca2_1\n", - 
"Linking krb5-1.19.2-hcc1bbae_3\n", - "Linking python-3.9.7-hb7a2778_3_cpython\n", - "Linking libarchive-3.5.2-hccf745f_1\n", - "Linking libcurl-7.80.0-h2574ce0_0\n", - "Linking python_abi-3.9-2_cp39\n", - "Linking wheel-0.37.0-pyhd8ed1ab_1\n", - "Linking libmamba-0.19.0-h3985d26_0\n", - "Linking setuptools-59.4.0-py39hf3d152e_0\n", - "Linking pip-21.3.1-pyhd8ed1ab_0\n", - "Linking six-1.16.0-pyh6c4a22f_0\n", - "Linking idna-3.1-pyhd3deb0d_0\n", - "Linking libmambapy-0.19.0-py39h8bfa403_0\n", - "Linking ruamel_yaml-0.15.80-py39h3811e60_1006\n", - "Linking pysocks-1.7.1-py39hf3d152e_4\n", - "Linking pycosat-0.6.3-py39h3811e60_1009\n", - "Linking certifi-2021.10.8-py39hf3d152e_1\n", - "Linking pycparser-2.21-pyhd8ed1ab_0\n", - "Linking colorama-0.4.4-pyh9f0ad1d_0\n", - "Linking charset-normalizer-2.0.9-pyhd8ed1ab_0\n", - "Linking cffi-1.15.0-py39h4bc2ebd_0\n", - "Linking tqdm-4.62.3-pyhd8ed1ab_0\n", - "Linking cryptography-36.0.0-py39h95dcef6_0\n", - "Linking brotlipy-0.7.0-py39h3811e60_1003\n", - "Linking conda-package-handling-1.7.3-py39h3811e60_1\n", - "Linking pyopenssl-21.0.0-pyhd8ed1ab_0\n", - "Linking urllib3-1.26.7-pyhd8ed1ab_0\n", - "Linking requests-2.26.0-pyhd8ed1ab_1\n", - "Linking conda-4.11.0-py39hf3d152e_0\n", - "Linking mamba-0.19.0-py39hfa8f2c8_0\n", - "Transaction finished\n", - "installation finished.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - " % Total % Received % Xferd Average Speed Time Time Time Current\n", - " Dload Upload Total Spent Left Speed\n", - "100 160 100 160 0 0 969 0 --:--:-- --:--:-- --:--:-- 969\n", - "100 665 100 665 0 0 2224 0 --:--:-- --:--:-- --:--:-- 2224\n", - "100 102M 100 102M 0 0 14.3M 0 0:00:07 0:00:07 --:--:-- 14.4M\n", - "bash: line 7: !cp: command not found\n" - ] - } - ], "source": [ - "%%bash\n", - "\n", - "#install Mamba\n", - "curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh\n", - "bash Mambaforge-$(uname)-$(uname 
-m).sh -b -p $HOME/mambaforge\n", - "rm Mamba*" + "## Learning objectives\n", + "+ Download genomic data from NCBI from the commnd line\n", + "+ Run pangolin to identify viral lineages\n", + "+ Generate a phylogeny to visualize lineage identity" ] }, { - "cell_type": "code", - "execution_count": 20, - "id": "a40f7ebc", + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "#move mamba executable to your path\n", - "!cp ~/mambaforge/bin/mamba /home/ec2-user/anaconda3/condabin" + "## Get started" ] }, { - "cell_type": "code", - "execution_count": 21, - "id": "f421805e", + "cell_type": "markdown", + "id": "03541941", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Requirement already satisfied: biopython in /home/ec2-user/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages (1.79)\n", - "Requirement already satisfied: numpy in /home/ec2-user/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages (from biopython) (1.19.5)\n", - "\u001b[33mWARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.\n", - "You should consider upgrading via the '/home/ec2-user/anaconda3/envs/amazonei_mxnet_p36/bin/python -m pip install --upgrade pip' command.\u001b[0m\n" - ] - } - ], "source": [ - "#install biopython to import packages below\n", - "!pip install biopython" + "### Install packages" ] }, { "cell_type": "code", - "execution_count": 41, - "id": "fd936fd6", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Collecting package metadata (current_repodata.json): done\n", - "Solving environment: - \n", - "The environment is inconsistent, please check the package plan carefully\n", - "The following packages are causing the inconsistency:\n", - "\n", - " - conda-forge/noarch::seaborn-base==0.11.1=pyhd8ed1ab_1\n", - " - conda-forge/noarch::nbclassic==0.2.6=pyhd8ed1ab_0\n", - " - conda-forge/noarch::typing-extensions==3.7.4.3=0\n", - 
" - conda-forge/linux-64::pluggy==0.13.1=py36h5fab9bb_4\n", - " - conda-forge/linux-64::blaze==0.11.3=py36_0\n", - " - conda-forge/linux-64::matplotlib==3.3.4=py36h5fab9bb_0\n", - " - conda-forge/noarch::python-language-server==0.36.2=pyhd8ed1ab_0\n", - " - conda-forge/noarch::jupyterlab_server==2.3.0=pyhd8ed1ab_0\n", - " - conda-forge/noarch::pyls-black==0.4.6=pyh9f0ad1d_0\n", - " - conda-forge/linux-64::scikit-image==0.16.2=py36hb3f55d8_0\n", - " - conda-forge/noarch::path.py==12.5.0=0\n", - " - conda-forge/noarch::qdarkstyle==2.8.1=pyhd8ed1ab_2\n", - " - conda-forge/noarch::ipywidgets==7.6.3=pyhd3deb0d_0\n", - " - conda-forge/noarch::black==20.8b1=py_1\n", - " - conda-forge/linux-64::anyio==2.1.0=py36h5fab9bb_0\n", - " - conda-forge/linux-64::jupyter_server==1.4.1=py36h5fab9bb_0\n", - " - conda-forge/noarch::nbclient==0.5.2=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::widgetsnbextension==3.5.1=py36h5fab9bb_4\n", - " - conda-forge/linux-64::bokeh==2.2.3=py36h5fab9bb_0\n", - " - conda-forge/linux-64::keyring==22.0.1=py36h5fab9bb_0\n", - " - conda-forge/linux-64::nbconvert==6.0.7=py36h5fab9bb_3\n", - " - conda-forge/noarch::numpydoc==1.1.0=py_1\n", - " - conda-forge/linux-64::spyder==4.2.0=py36h5fab9bb_0\n", - " - conda-forge/noarch::flake8==3.8.4=py_0\n", - " - conda-forge/noarch::pyls-spyder==0.3.2=pyhd8ed1ab_0\n", - " - conda-forge/noarch::nbformat==5.1.2=pyhd8ed1ab_1\n", - " - conda-forge/noarch::importlib_metadata==3.6.0=hd8ed1ab_0\n", - " - conda-forge/noarch::aioitertools==0.7.1=pyhd8ed1ab_0\n", - " - conda-forge/noarch::jupyterlab_launcher==0.13.1=py_2\n", - " - conda-forge/noarch::odo==0.5.1=py_1\n", - " - conda-forge/noarch::imageio==2.9.0=py_0\n", - " - conda-forge/noarch::helpdev==0.7.1=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::path==15.1.2=py36h5fab9bb_0\n", - " - conda-forge/noarch::jsonschema==3.2.0=py_2\n", - " - conda-forge/linux-64::yarl==1.6.3=py36h8f6f2f9_1\n", - " - conda-forge/noarch::sphinx==3.5.1=pyhd8ed1ab_0\n", - " - 
conda-forge/noarch::seaborn==0.11.1=hd8ed1ab_1\n", - " - conda-forge/linux-64::jupyter==1.0.0=py36h5fab9bb_6\n", - " - conda-forge/linux-64::nb_conda==2.2.1=py36h5fab9bb_4\n", - " - conda-forge/noarch::dask==2021.2.0=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::matplotlib-base==3.3.4=py36hd391965_0\n", - " - conda-forge/noarch::anaconda-client==1.7.2=py_0\n", - " - conda-forge/noarch::anaconda-project==0.9.1=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::importlib-metadata==3.6.0=py36h5fab9bb_0\n", - " - conda-forge/linux-64::pytest==6.2.2=py36h5fab9bb_0\n", - "failed with initial frozen solve. Retrying with flexible solve.\n", - "Collecting package metadata (repodata.json): done\n", - "Solving environment: - \n", - "The environment is inconsistent, please check the package plan carefully\n", - "The following packages are causing the inconsistency:\n", - "\n", - " - conda-forge/noarch::seaborn-base==0.11.1=pyhd8ed1ab_1\n", - " - conda-forge/noarch::nbclassic==0.2.6=pyhd8ed1ab_0\n", - " - conda-forge/noarch::typing-extensions==3.7.4.3=0\n", - " - conda-forge/linux-64::pluggy==0.13.1=py36h5fab9bb_4\n", - " - conda-forge/linux-64::blaze==0.11.3=py36_0\n", - " - conda-forge/linux-64::matplotlib==3.3.4=py36h5fab9bb_0\n", - " - conda-forge/noarch::python-language-server==0.36.2=pyhd8ed1ab_0\n", - " - conda-forge/noarch::jupyterlab_server==2.3.0=pyhd8ed1ab_0\n", - " - conda-forge/noarch::pyls-black==0.4.6=pyh9f0ad1d_0\n", - " - conda-forge/linux-64::scikit-image==0.16.2=py36hb3f55d8_0\n", - " - conda-forge/noarch::path.py==12.5.0=0\n", - " - conda-forge/noarch::qdarkstyle==2.8.1=pyhd8ed1ab_2\n", - " - conda-forge/noarch::ipywidgets==7.6.3=pyhd3deb0d_0\n", - " - conda-forge/noarch::black==20.8b1=py_1\n", - " - conda-forge/linux-64::anyio==2.1.0=py36h5fab9bb_0\n", - " - conda-forge/linux-64::jupyter_server==1.4.1=py36h5fab9bb_0\n", - " - conda-forge/noarch::nbclient==0.5.2=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::widgetsnbextension==3.5.1=py36h5fab9bb_4\n", - " - 
conda-forge/linux-64::bokeh==2.2.3=py36h5fab9bb_0\n", - " - conda-forge/linux-64::keyring==22.0.1=py36h5fab9bb_0\n", - " - conda-forge/linux-64::nbconvert==6.0.7=py36h5fab9bb_3\n", - " - conda-forge/noarch::numpydoc==1.1.0=py_1\n", - " - conda-forge/linux-64::spyder==4.2.0=py36h5fab9bb_0\n", - " - conda-forge/noarch::flake8==3.8.4=py_0\n", - " - conda-forge/noarch::pyls-spyder==0.3.2=pyhd8ed1ab_0\n", - " - conda-forge/noarch::nbformat==5.1.2=pyhd8ed1ab_1\n", - " - conda-forge/noarch::importlib_metadata==3.6.0=hd8ed1ab_0\n", - " - conda-forge/noarch::aioitertools==0.7.1=pyhd8ed1ab_0\n", - " - conda-forge/noarch::jupyterlab_launcher==0.13.1=py_2\n", - " - conda-forge/noarch::odo==0.5.1=py_1\n", - " - conda-forge/noarch::imageio==2.9.0=py_0\n", - " - conda-forge/noarch::helpdev==0.7.1=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::path==15.1.2=py36h5fab9bb_0\n", - " - conda-forge/noarch::jsonschema==3.2.0=py_2\n", - " - conda-forge/linux-64::yarl==1.6.3=py36h8f6f2f9_1\n", - " - conda-forge/noarch::sphinx==3.5.1=pyhd8ed1ab_0\n", - " - conda-forge/noarch::seaborn==0.11.1=hd8ed1ab_1\n", - " - conda-forge/linux-64::jupyter==1.0.0=py36h5fab9bb_6\n", - " - conda-forge/linux-64::nb_conda==2.2.1=py36h5fab9bb_4\n", - " - conda-forge/noarch::dask==2021.2.0=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::matplotlib-base==3.3.4=py36hd391965_0\n", - " - conda-forge/noarch::anaconda-client==1.7.2=py_0\n", - " - conda-forge/noarch::anaconda-project==0.9.1=pyhd8ed1ab_0\n", - " - conda-forge/linux-64::importlib-metadata==3.6.0=py36h5fab9bb_0\n", - " - conda-forge/linux-64::pytest==6.2.2=py36h5fab9bb_0\n", - "done\n", - "\n", - "\n", - "==> WARNING: A newer version of conda exists. 
<==\n", - " current version: 4.8.4\n", - " latest version: 4.11.0\n", - "\n", - "Please update conda by running\n", - "\n", - " $ conda update -n base -c defaults conda\n", - "\n", - "\n", - "\n", - "## Package Plan ##\n", - "\n", - " environment location: /home/ec2-user/anaconda3/envs/amazonei_mxnet_p36\n", - "\n", - " added / updated specs:\n", - " - ipyrad\n", - "\n", - "\n", - "The following packages will be downloaded:\n", - "\n", - " package | build\n", - " ---------------------------|-----------------\n", - " astroid-2.7.3 | py36h5fab9bb_0 330 KB conda-forge\n", - " bedtools-2.30.0 | h7d7f7ad_1 17.9 MB bioconda\n", - " cutadapt-3.4 | py36hc5360cc_1 197 KB bioconda\n", - " dataclasses-0.8 | pyh787bdff_2 22 KB conda-forge\n", - " dnaio-0.5.1 | py36hc5360cc_0 137 KB bioconda\n", - " flask-cors-3.0.10 | pyhd8ed1ab_0 15 KB conda-forge\n", - " fsspec-2021.11.1 | pyhd8ed1ab_0 91 KB conda-forge\n", - " htslib-1.11 | hd3b49d5_1 1.7 MB bioconda\n", - " jupyter_console-5.2.0 | py36_1 34 KB conda-forge\n", - " libdeflate-1.6 | h516909a_0 60 KB conda-forge\n", - " mpi4py-3.0.3 | py36he1a1962_7 696 KB conda-forge\n", - " notebook-6.3.0 | py36h5fab9bb_0 6.3 MB conda-forge\n", - " perl-5.32.1 | 0_h7f98852_perl5 14.5 MB conda-forge\n", - " pillow-8.2.0 | py36ha6010c0_1 688 KB conda-forge\n", - " platformdirs-2.3.0 | pyhd8ed1ab_0 14 KB conda-forge\n", - " pylint-2.10.2 | pyhd8ed1ab_0 255 KB conda-forge\n", - " pysam-0.16.0.1 | py36h4c34d4e_1 2.5 MB bioconda\n", - " python-isal-0.11.0 | py36h8f6f2f9_0 136 KB conda-forge\n", - " reportlab-3.5.68 | py36h3e18861_0 2.4 MB conda-forge\n", - " samtools-1.11 | h6270b1f_0 383 KB bioconda\n", - " typing_extensions-3.7.4.3 | py_0 25 KB conda-forge\n", - " vsearch-2.17.1 | h95f258a_0 1.4 MB bioconda\n", - " xopen-1.2.0 | py36h5fab9bb_0 22 KB conda-forge\n", - " ------------------------------------------------------------\n", - " Total: 49.7 MB\n", - "\n", - "The following NEW packages will be INSTALLED:\n", - "\n", - " arrow 
conda-forge/noarch::arrow-1.2.1-pyhd8ed1ab_0\n", - " astroid conda-forge/linux-64::astroid-2.7.3-py36h5fab9bb_0\n", - " bedtools bioconda/linux-64::bedtools-2.30.0-h7d7f7ad_1\n", - " bwa bioconda/linux-64::bwa-0.7.17-h5bf99c6_8\n", - " charset-normalizer conda-forge/noarch::charset-normalizer-2.0.9-pyhd8ed1ab_0\n", - " colorama conda-forge/noarch::colorama-0.4.4-pyh9f0ad1d_0\n", - " custom-inherit conda-forge/noarch::custom-inherit-2.4.0-pyhd8ed1ab_0\n", - " cutadapt bioconda/linux-64::cutadapt-3.4-py36hc5360cc_1\n", - " dataclasses conda-forge/noarch::dataclasses-0.8-pyh787bdff_2\n", - " dnaio bioconda/linux-64::dnaio-0.5.1-py36hc5360cc_0\n", - " docutils conda-forge/linux-64::docutils-0.16-py36h5fab9bb_3\n", - " flask-cors conda-forge/noarch::flask-cors-3.0.10-pyhd8ed1ab_0\n", - " fsspec conda-forge/noarch::fsspec-2021.11.1-pyhd8ed1ab_0\n", - " htslib bioconda/linux-64::htslib-1.11-hd3b49d5_1\n", - " ipyrad bioconda/noarch::ipyrad-0.9.81-pyh5e36f6f_0\n", - " isa-l conda-forge/linux-64::isa-l-2.30.0-ha770c72_4\n", - " jupyter_console conda-forge/linux-64::jupyter_console-5.2.0-py36_1\n", - " libdeflate conda-forge/linux-64::libdeflate-1.6-h516909a_0\n", - " mpi conda-forge/linux-64::mpi-1.0-openmpi\n", - " mpi4py conda-forge/linux-64::mpi4py-3.0.3-py36he1a1962_7\n", - " muscle bioconda/linux-64::muscle-3.8.1551-h7d875b9_6\n", - " notebook conda-forge/linux-64::notebook-6.3.0-py36h5fab9bb_0\n", - " openjpeg conda-forge/linux-64::openjpeg-2.4.0-hb52868f_1\n", - " openmpi conda-forge/linux-64::openmpi-4.1.1-hbfc84c5_0\n", - " pbzip2 conda-forge/linux-64::pbzip2-1.1.13-0\n", - " perl conda-forge/linux-64::perl-5.32.1-0_h7f98852_perl5\n", - " pigz conda-forge/linux-64::pigz-2.6-h27826a3_0\n", - " pillow conda-forge/linux-64::pillow-8.2.0-py36ha6010c0_1\n", - " platformdirs conda-forge/noarch::platformdirs-2.3.0-pyhd8ed1ab_0\n", - " pylint conda-forge/noarch::pylint-2.10.2-pyhd8ed1ab_0\n", - " pypng conda-forge/noarch::pypng-0.0.20-py_0\n", - " pysam 
bioconda/linux-64::pysam-0.16.0.1-py36h4c34d4e_1\n", - " python-isal conda-forge/linux-64::python-isal-0.11.0-py36h8f6f2f9_0\n", - " reportlab conda-forge/linux-64::reportlab-3.5.68-py36h3e18861_0\n", - " requests conda-forge/noarch::requests-2.26.0-pyhd8ed1ab_1\n", - " samtools bioconda/linux-64::samtools-1.11-h6270b1f_0\n", - " toyplot conda-forge/noarch::toyplot-0.19.0-pyh9f0ad1d_0\n", - " typing_extensions conda-forge/noarch::typing_extensions-3.7.4.3-py_0\n", - " urllib3 conda-forge/noarch::urllib3-1.26.7-pyhd8ed1ab_0\n", - " vsearch bioconda/linux-64::vsearch-2.17.1-h95f258a_0\n", - " xopen conda-forge/linux-64::xopen-1.2.0-py36h5fab9bb_0\n", - "\n", - "The following packages will be DOWNGRADED:\n", - "\n", - " libgcc-ng 11.2.0-h1d223b6_11 --> 9.3.0-h2828fa1_18\n", - " libgomp 11.2.0-h1d223b6_11 --> 9.3.0-h2828fa1_18\n", - " openssl 1.1.1l-h7f98852_0 --> 1.1.1k-h7f98852_0\n", - "\n", - "\n", - "\n", - "Downloading and Extracting Packages\n", - "reportlab-3.5.68 | 2.4 MB | ##################################### | 100% \n", - "dnaio-0.5.1 | 137 KB | ##################################### | 100% \n", - "htslib-1.11 | 1.7 MB | ##################################### | 100% \n", - "cutadapt-3.4 | 197 KB | ##################################### | 100% \n", - "libdeflate-1.6 | 60 KB | ##################################### | 100% \n", - "flask-cors-3.0.10 | 15 KB | ##################################### | 100% \n", - "typing_extensions-3. 
| 25 KB | ##################################### | 100% \n", - "samtools-1.11 | 383 KB | ##################################### | 100% \n", - "fsspec-2021.11.1 | 91 KB | ##################################### | 100% \n", - "bedtools-2.30.0 | 17.9 MB | ##################################### | 100% \n", - "perl-5.32.1 | 14.5 MB | ##################################### | 100% \n", - "python-isal-0.11.0 | 136 KB | ##################################### | 100% \n", - "dataclasses-0.8 | 22 KB | ##################################### | 100% \n", - "pillow-8.2.0 | 688 KB | ##################################### | 100% \n", - "astroid-2.7.3 | 330 KB | ##################################### | 100% \n", - "pylint-2.10.2 | 255 KB | ##################################### | 100% \n", - "pysam-0.16.0.1 | 2.5 MB | ##################################### | 100% \n", - "vsearch-2.17.1 | 1.4 MB | ##################################### | 100% \n", - "jupyter_console-5.2. | 34 KB | ##################################### | 100% \n", - "xopen-1.2.0 | 22 KB | ##################################### | 100% \n", - "mpi4py-3.0.3 | 696 KB | ##################################### | 100% \n", - "platformdirs-2.3.0 | 14 KB | ##################################### | 100% \n", - "notebook-6.3.0 | 6.3 MB | ##################################### | 100% \n", - "Preparing transaction: done\n", - "Verifying transaction: done\n", - "Executing transaction: - \n", - "For Linux 64, Open MPI is built with CUDA awareness but this support is disabled by default.\n", - "To enable it, please set the environment variable OMPI_MCA_opal_cuda_support=true before\n", - "launching your MPI processes. Equivalently, you can set the MCA parameter in the command line:\n", - "mpiexec --mca opal_cuda_support 1 ...\n", - " \n", - "In addition, the UCX support is also built but disabled by default.\n", - "To enable it, first install UCX (conda install -c conda-forge ucx). 
Then, set the environment\n", - "variables OMPI_MCA_pml=\"ucx\" OMPI_MCA_osc=\"ucx\" before launching your MPI processes.\n", - "Equivalently, you can set the MCA parameters in the command line:\n", - "mpiexec --mca pml ucx --mca osc ucx ...\n", - "Note that you might also need to set UCX_MEMTYPE_CACHE=n for CUDA awareness via UCX.\n", - "Please consult UCX's documentation for detail.\n", - " \n", - "\n", - "done\n" - ] - } - ], - "source": [ - "!conda install ipyrad -y -c conda-forge -c bioconda" - ] - }, - { - "cell_type": "markdown", - "id": "2d0f27ee", + "execution_count": null, + "id": "f994b990", "metadata": {}, + "outputs": [], "source": [ - "Now we want to create a conda/mamba env that has all of our necessary dependencies" + "#change this depending on how many threads are available in your notebook\n", + "CPU=4" ] }, { "cell_type": "code", - "execution_count": 23, - "id": "4ba6fae7", + "execution_count": null, + "id": "a19b662e", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "name: pangolin\n", - "channels:\n", - " - bioconda\n", - " - conda-forge\n", - " - defaults\n", - " - eaton-lab\n", - " \n", - "dependencies:\n", - " - sra-tools\n", - " - ipyrad\n", - " - toytree\n", - " - pangolin\n", - " - iqtree\n" - ] - } - ], + "outputs": [], "source": [ - "#you can look at the yaml file that specifies which programs we want to install\n", - "#you can also specify specific versions, here we just use the latest conda versionå\n", - "#for example, - sra-tools=2.11.0\n", - "!cat pangolin.yaml" + "! curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh\n", + "! 
bash Mambaforge-$(uname)-$(uname -m).sh -b -p $HOME/mambaforge" ] }, { "cell_type": "code", - "execution_count": 25, - "id": "49a20dc5", + "execution_count": null, + "id": "a40f7ebc", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "usage: mamba [-h] {create,export,list,remove,update,config} ...\n", - "mamba: error: unrecognized arguments: -y\n" - ] - } - ], + "outputs": [], "source": [ - "#create the environment. Here we use mamba because it is faster than conda\n", - "!mamba env create -f pangolin.yaml -y" + "#add to your path\n", + "import os\n", + "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"]+\"/mambaforge/bin\"" ] }, { "cell_type": "code", - "execution_count": 33, - "id": "fd23abbd", + "execution_count": null, + "id": "f421805e", "metadata": {}, "outputs": [], "source": [ - "#give it the whole path to the env because otherwise it can't find the env\n", - "#if you want to play with it add a cell and type 'conda activate pangolin' \n", - "#or 'source activate pangolin'\n", - "!source activate /home/ec2-user/mambaforge/envs/pangolin\n", - "#!mamba info --envs" + "#install biopython to import packages below\n", + "! 
pip install biopython" ] }, { "cell_type": "code", - "execution_count": 35, - "id": "96dd7966", + "execution_count": null, + "id": "fd936fd6", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - " __ __ __ __\n", - " / \\ / \\ / \\ / \\\n", - " / \\/ \\/ \\/ \\\n", - "███████████████/ /██/ /██/ /██/ /████████████████████████\n", - " / / \\ / \\ / \\ / \\ \\____\n", - " / / \\_/ \\_/ \\_/ \\ o \\__,\n", - " / _/ \\_____/ `\n", - " |/\n", - " ███╗ ███╗ █████╗ ███╗ ███╗██████╗ █████╗\n", - " ████╗ ████║██╔══██╗████╗ ████║██╔══██╗██╔══██╗\n", - " ██╔████╔██║███████║██╔████╔██║██████╔╝███████║\n", - " ██║╚██╔╝██║██╔══██║██║╚██╔╝██║██╔══██╗██╔══██║\n", - " ██║ ╚═╝ ██║██║ ██║██║ ╚═╝ ██║██████╔╝██║ ██║\n", - " ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝\n", - "\n", - " mamba (0.19.0) supported by @QuantStack\n", - "\n", - " GitHub: https://github.com/mamba-org/mamba\n", - " Twitter: https://twitter.com/QuantStack\n", - "\n", - "█████████████████████████████████████████████████████████████\n", - "\n", - "\n", - "Looking for: ['iqtree']\n", - "\n", - "bioconda/linux-64 Using cache\n", - "bioconda/noarch Using cache\n", - "conda-forge/linux-64 Using cache\n", - "conda-forge/noarch Using cache\n", - "pkgs/main/linux-64 Using cache\n", - "pkgs/main/noarch Using cache\n", - "pkgs/r/linux-64 Using cache\n", - "pkgs/r/noarch Using cache\n", - "\n", - "Pinned packages:\n", - " - python 3.6.*\n", - "\n", - "\n", - "Transaction\n", - "\n", - " Prefix: /home/ec2-user/anaconda3/envs/amazonei_mxnet_p36\n", - "\n", - " All requested packages already installed\n", - "\n" - ] - } - ], + "outputs": [], "source": [ - "!mamba install -c bioconda iqtree -y" + "! 
mamba install ipyrad iqtree -c conda-forge -c bioconda" ] }, { "cell_type": "code", - "execution_count": 40, + "execution_count": null, "id": "5a99cf0d", "metadata": {}, - "outputs": [ - { - "ename": "ModuleNotFoundError", - "evalue": "No module named 'ipyrad'", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mBio\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mSeqIO\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mBio\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mEntrez\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mipyrad\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0manalysis\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mipa\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mtoytree\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'ipyrad'" - ] - } - ], + "outputs": [], "source": [ "#import libraries\n", "import os\n", @@ -825,33 +121,14 @@ }, { "cell_type": "markdown", - "id": "dc694629", "metadata": {}, "source": [ - "### Set up your directory structure and remove files from previous runs if they exist" + "### Set up directory structure" ] }, { "cell_type": "code", - "execution_count": 28, - "id": "0f0e81f3", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/home/jupyter/cloud-lab-training/GCP/notebooks/pangolin\n" - ] - } - ], - "source": [ - "cd /home/jupyter/cloud-lab-training/GCP/notebooks/pangolin/" - ] - }, - { - 
"cell_type": "code", - "execution_count": 29, + "execution_count": null, "id": "8f831fca", "metadata": {}, "outputs": [], @@ -863,19 +140,10 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": null, "id": "6423ca5d", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "rm: cannot remove 'sarscov2_*': No such file or directory\n", - "rm: cannot remove 'lineage_report.csv': No such file or directory\n" - ] - } - ], + "outputs": [], "source": [ "if os.path.exists('sarscov2_sequences.fasta'):\n", " os.remove('sarscov2_sequences.fasta')\n", @@ -893,18 +161,10 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": null, "id": "16824bcf", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "the number of sequences we will analyze = 18\n" - ] - } - ], + "outputs": [], "source": [ "#give a list of accession number for covid sequences\n", "acc_nums=['NC_045512','LR757995','LR757996','OL698718','OL677199','OL672836','MZ914912','MZ916499','MZ908464','MW580573','MW580574','MW580576','MW991906','MW931310','MW932027','MW424864','MW453109','MW453110']\n", @@ -921,35 +181,10 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": null, "id": "a28a7122", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Saved NC_045512\n", - "Saved LR757995\n", - "Saved LR757996\n", - "Saved OL698718\n", - "Saved OL677199\n", - "Saved OL672836\n", - "Saved MZ914912\n", - "Saved MZ916499\n", - "Saved MZ908464\n", - "Saved MW580573\n", - "Saved MW580574\n", - "Saved MW580576\n", - "Saved MW991906\n", - "Saved MW931310\n", - "Saved MW932027\n", - "Saved MW424864\n", - "Saved MW453109\n", - "Saved MW453110\n" - ] - } - ], + "outputs": [], "source": [ "#use the bio.entrez toolkit within biopython to download the accession numbers\n", "#save those sequences to a single fasta file\n", @@ -970,51 +205,25 @@ }, { 
"cell_type": "code", - "execution_count": 15, + "execution_count": null, "id": "56acb7cc", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "the number of seqs in our fasta file: \n", - "18\n" - ] - } - ], + "outputs": [], "source": [ "#make sure our fasta file has the same number of seqs as the acc_nums list\n", "print('the number of seqs in our fasta file: ')\n", - "!grep '>' sarscov2_seqs.fasta | wc -l" + "! grep '>' sarscov2_seqs.fasta | wc -l" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": null, "id": "8606c352", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - ">NC_045512.2 Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome\n", - "ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAA\n", - "CGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAAC\n", - "TAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTG\n", - "TTGCAGCCGATCATCAGCACATCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTC\n", - "CCTGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGTGCTCGTAC\n", - "GTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCTTAAAGATGGCACTTGTGG\n", - "CTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACAGCCCTATGTGTTCATCAAACGTTCGGAT\n", - "GCTCGAACTGCACCTCATGGTCATGTTATGGTTGAGCTGGTAGCAGAACTCGAAGGCATTCAGTACGGTC\n", - "GTAGTGGTGAGACACTTGGTGTCCTTGTCCCTCATGTGGGCGAAATACCAGTGGCTTACCGCAAGGTTCT\n" - ] - } - ], + "outputs": [], "source": [ "#let's peek at our new fasta file\n", - "!head sarscov2_seqs.fasta" + "! 
head sarscov2_seqs.fasta" ] }, { @@ -1030,49 +239,12 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": null, "id": "f1a17a74", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[32mAll dependencies satisfied.\u001b[0m\n", - "\u001b[32mThe query file is:\u001b[0m/home/jupyter/cloud-lab-training/GCP/notebooks/pangolin/pangolin_analysis/sarscov2_seqs.fasta\n", - "\u001b[32m** Running sequence QC **\u001b[0m\n", - "\u001b[32mNumber of sequences detected: \u001b[0m18\n", - "\u001b[32mTotal passing QC: \u001b[0m18\n", - "\u001b[32m\n", - "Data files found:\u001b[0m\n", - "Trained model:\t/opt/conda/lib/python3.7/site-packages/pangoLEARN/data/decisionTree_v1.joblib\n", - "Header file:\t/opt/conda/lib/python3.7/site-packages/pangoLEARN/data/decisionTreeHeaders_v1.joblib\n", - "Designated hash:\t/opt/conda/lib/python3.7/site-packages/pangoLEARN/data/lineages.hash.csv\n", - "\u001b[33mJob stats:\n", - "job count min threads max threads\n", - "-------------------- ------- ------------- -------------\n", - "add_failed_seqs 1 1 1\n", - "align_to_reference 1 1 1\n", - "all 1 1 1\n", - "generate_report 1 1 1\n", - "get_constellations 1 1 1\n", - "hash_sequence_assign 1 1 1\n", - "pangolearn 1 1 1\n", - "scorpio 1 4 4\n", - "total 8 1 4\n", - "\u001b[0m\n", - "loading model 12/04/2021, 00:00:50\n", - "/opt/conda/lib/python3.7/site-packages/sklearn/base.py:334: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.24.2 when using version 0.23.1. This might lead to breaking code or invalid results. 
Use at your own risk.\n", - " UserWarning)\n", - "processing block of 6 sequences 12/04/2021, 00:00:51\n", - "complete 12/04/2021, 00:00:51\n", - "\u001b[32mOutput file written to: \u001b[0m/home/jupyter/cloud-lab-training/GCP/notebooks/pangolin/pangolin_analysis/lineage_report.csv\n", - "\u001b[32mOutput alignment written to: \u001b[0m/home/jupyter/cloud-lab-training/GCP/notebooks/pangolin/pangolin_analysis/sequences.aln.fasta\n" - ] - } - ], + "outputs": [], "source": [ - "!pangolin sarscov2_seqs.fasta --alignment --threads $CPU" + "! pangolin sarscov2_seqs.fasta --alignment --threads $CPU" ] }, { @@ -1094,144 +266,13 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": null, "id": "f2782855", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "IQ-TREE multicore version 2.1.4-beta COVID-edition for Linux 64-bit built Jun 24 2021\n", - "Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung,\n", - "Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams.\n", - "\n", - "Host: cloud-lab-notebook (AVX2, FMA3, 14 GB RAM)\n", - "Command: iqtree -s sequences.aln.fasta -nt 4 -m HKY --prefix sarscov2_tree --redo-tree\n", - "Seed: 719057 (Using SPRNG - Scalable Parallel Random Number Generator)\n", - "Time: Fri Dec 3 23:53:05 2021\n", - "Kernel: AVX+FMA - 4 threads (4 CPU cores detected)\n", - "\n", - "Reading alignment file sequences.aln.fasta ... 
Fasta format detected\n", - "Alignment most likely contains DNA/RNA sequences\n", - "WARNING: 494 sites contain only gaps or ambiguous characters.\n", - "Alignment has 18 sequences with 29903 columns, 193 distinct patterns\n", - "109 parsimony-informative, 33 singleton sites, 29761 constant sites\n", - "WARNING: Some sequence names are changed as follows:\n", - "LR757995.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome:_whole_genome -> LR757995.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome__whole_genome\n", - "LR757996.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome:_whole_genome -> LR757996.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome__whole_genome\n", - "OL698718.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MN-MDH-18236/2021_ORF1ab_polyprotein_(ORF1ab)__ORF1a_polyprotein_(ORF1ab)__surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__ORF7a_protein_(ORF7a)__ORF7b_(ORF7b)__ORF8_protein_(ORF8)__nucleocapsid_phosphoprotein_(N)__and_ORF10_protein_(ORF10)_genes__complete_cds -> OL698718.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MN-MDH-18236/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds\n", - 
"OL677199.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/CAN/ON-NML-249359/2021_ORF1ab_polyprotein_(ORF1ab)__ORF1a_polyprotein_(ORF1ab)__surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__and_ORF7a_protein_(ORF7a)_genes__complete_cds;_ORF7b_gene__complete_sequence;_and_ORF8_protein_(ORF8)__nucleocapsid_phosphoprotein_(N)__and_ORF10_protein_(ORF10)_genes__complete_cds -> OL677199.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/CAN/ON-NML-249359/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___and_ORF7a_protein__ORF7a__genes__complete_cds__ORF7b_gene__complete_sequence__and_ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds\n", - "MZ914912.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG796484/2020_ORF1ab_polyprotein_(ORF1ab)__ORF1a_polyprotein_(ORF1ab)__surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__ORF7a_protein_(ORF7a)__ORF7b_(ORF7b)__ORF8_protein_(ORF8)__nucleocapsid_phosphoprotein_(N)__and_ORF10_protein_(ORF10)_genes__complete_cds -> MZ914912.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG796484/2020_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds\n", - 
"MZ916499.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG841289/2020_ORF1ab_polyprotein_(ORF1ab)_and_ORF1a_polyprotein_(ORF1ab)_genes__partial_cds;_and_surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__ORF7a_protein_(ORF7a)__ORF7b_(ORF7b)__ORF8_protein_(ORF8)__nucleocapsid_phosphoprotein_(N)__and_ORF10_protein_(ORF10)_genes__complete_cds -> MZ916499.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG841289/2020_ORF1ab_polyprotein__ORF1ab__and_ORF1a_polyprotein__ORF1ab__genes__partial_cds__and_surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds\n", - "MZ908464.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG769681/2020_ORF1ab_polyprotein_(ORF1ab)__ORF1a_polyprotein_(ORF1ab)__surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__and_ORF6_protein_(ORF6)_genes__complete_cds;_ORF7a_protein_(ORF7a)_and_ORF7b_(ORF7b)_genes__partial_cds;_and_ORF8_protein_(ORF8)__nucleocapsid_phosphoprotein_(N)__and_ORF10_protein_(ORF10)_genes__complete_cds -> MZ908464.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG769681/2020_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___and_ORF6_protein__ORF6__genes__complete_cds__ORF7a_protein__ORF7a__and_ORF7b__ORF7b__genes__partial_cds__and_ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds\n", - 
"MW580573.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MD-MDH-0830/2021_ORF1ab_polyprotein_(ORF1ab)__ORF1a_polyprotein_(ORF1ab)__surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__ORF7a_protein_(ORF7a)__ORF7b_(ORF7b)__ORF8_protein_(ORF8)__nucleocapsid_phosphoprotein_(N)__and_ORF10_protein_(ORF10)_genes__complete_cds -> MW580573.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MD-MDH-0830/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds\n", - "MW991906.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/CA-CDC-FG-021330/2021_ORF1ab_polyprotein_(ORF1ab)_and_ORF1a_polyprotein_(ORF1ab)_genes__partial_cds;_surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__ORF7a_protein_(ORF7a)__and_ORF7b_(ORF7b)_genes__complete_cds;_ORF8_gene__complete_sequence;_and_nucleocapsid_phosphoprotein_(N)_and_ORF10_protein_(ORF10)_genes__complete_cds -> MW991906.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/CA-CDC-FG-021330/2021_ORF1ab_polyprotein__ORF1ab__and_ORF1a_polyprotein__ORF1ab__genes__partial_cds__surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___and_ORF7b__ORF7b__genes__complete_cds__ORF8_gene__complete_sequence__and_nucleocapsid_phosphoprotein__N__and_ORF10_protein__ORF10__genes__complete_cds\n", - 
"MW932027.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MA-CDC-STM-000044850/2021_ORF1ab_polyprotein_(ORF1ab)__ORF1a_polyprotein_(ORF1ab)__surface_glycoprotein_(S)__ORF3a_protein_(ORF3a)__envelope_protein_(E)__membrane_glycoprotein_(M)__ORF6_protein_(ORF6)__ORF7a_protein_(ORF7a)__and_ORF7b_(ORF7b)_genes__complete_cds;_ORF8_gene__complete_sequence;_and_nucleocapsid_phosphoprotein_(N)_and_ORF10_protein_(ORF10)_genes__complete_cds -> MW932027.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MA-CDC-STM-000044850/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___and_ORF7b__ORF7b__genes__complete_cds__ORF8_gene__complete_sequence__and_nucleocapsid_phosphoprotein__N__and_ORF10_protein__ORF10__genes__complete_cds\n", - "\n", - " Gap/Ambiguity Composition p-value\n", - " 1 NC_045512.2_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_Wuhan-Hu-1__complete_genome 1.65% passed 99.98%\n", - " 2 LR757995.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome__whole_genome 1.65% passed 99.98%\n", - " 3 LR757996.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome__whole_genome 1.65% passed 99.98%\n", - " 4 OL698718.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MN-MDH-18236/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds 3.28% passed 99.65%\n", - " 5 
OL677199.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/CAN/ON-NML-249359/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___and_ORF7a_protein__ORF7a__genes__complete_cds__ORF7b_gene__complete_sequence__and_ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds 1.82% passed 99.91%\n", - " 6 OL672836.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/BEL/rega-20174/2021__complete_genome 1.78% passed 99.96%\n", - " 7 MZ914912.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG796484/2020_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds 2.69% passed 99.99%\n", - " 8 MZ916499.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG841289/2020_ORF1ab_polyprotein__ORF1ab__and_ORF1a_polyprotein__ORF1ab__genes__partial_cds__and_surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds 7.75% passed 98.28%\n", - " 9 
MZ908464.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/TG769681/2020_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___and_ORF6_protein__ORF6__genes__complete_cds__ORF7a_protein__ORF7a__and_ORF7b__ORF7b__genes__partial_cds__and_ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds 5.93% passed 96.00%\n", - " 10 MW580573.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MD-MDH-0830/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___ORF7b__ORF7b___ORF8_protein__ORF8___nucleocapsid_phosphoprotein__N___and_ORF10_protein__ORF10__genes__complete_cds 2.26% passed 99.99%\n", - " 11 MW580574.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MD-MDH-0831/2021__complete_genome 2.02% passed 99.95%\n", - " 12 MW580576.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MD-MDH-0833/2021__complete_genome 1.98% passed 99.93%\n", - " 13 MW991906.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/CA-CDC-FG-021330/2021_ORF1ab_polyprotein__ORF1ab__and_ORF1a_polyprotein__ORF1ab__genes__partial_cds__surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___and_ORF7b__ORF7b__genes__complete_cds__ORF8_gene__complete_sequence__and_nucleocapsid_phosphoprotein__N__and_ORF10_protein__ORF10__genes__complete_cds 2.19% passed 99.82%\n", - " 14 MW931310.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/IN-CDC-STM-000045992/2021__complete_genome 1.68% passed 100.00%\n", - " 15 
MW932027.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/MA-CDC-STM-000044850/2021_ORF1ab_polyprotein__ORF1ab___ORF1a_polyprotein__ORF1ab___surface_glycoprotein__S___ORF3a_protein__ORF3a___envelope_protein__E___membrane_glycoprotein__M___ORF6_protein__ORF6___ORF7a_protein__ORF7a___and_ORF7b__ORF7b__genes__complete_cds__ORF8_gene__complete_sequence__and_nucleocapsid_phosphoprotein__N__and_ORF10_protein__ORF10__genes__complete_cds 1.70% passed 99.98%\n", - " 16 MW424864.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/CA-LACPHL-AF00051/2020__complete_genome 1.91% passed 99.99%\n", - " 17 MW453109.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/CA-LACPHL-AF00094/2020__complete_genome 2.13% passed 99.99%\n", - " 18 MW453110.1_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_SARS-CoV-2/human/USA/CA-LACPHL-AF00093/2020__complete_genome 1.93% passed 99.98%\n", - "**** TOTAL 2.56% 0 sequences failed composition chi2 test (p-value<5%; df=3)\n", - "NOTE: LR757996.1_Severe_acute_respiratory_syndrome_coronavirus_2_genome_assembly__chromosome__whole_genome is identical to NC_045512.2_Severe_acute_respiratory_syndrome_coronavirus_2_isolate_Wuhan-Hu-1__complete_genome but kept for subsequent analysis\n", - "Creating fast initial parsimony tree by random order stepwise addition...\n", - "0.003 seconds, parsimony score: 154 (based on 142 sites)\n", - "\n", - "NOTE: 0 MB RAM (0 GB) is required!\n", - "WARNING: Number of threads seems too high for short alignments. Use -T AUTO to determine best number of threads.\n", - "Estimate model parameters (epsilon = 0.100)\n", - "1. Initial log-likelihood: -41361.570\n", - "2. Current log-likelihood: -41335.547\n", - "3. Current log-likelihood: -41330.199\n", - "4. Current log-likelihood: -41321.132\n", - "5. 
Current log-likelihood: -41320.973\n", - "Optimal log-likelihood: -41320.963\n", - "Rate parameters: A-C: 1.00000 A-G: 4.58202 A-T: 1.00000 C-G: 1.00000 C-T: 4.58202 G-T: 1.00000\n", - "Base frequencies: A: 0.299 C: 0.183 G: 0.196 T: 0.322\n", - "Parameters optimization took 5 rounds (0.015 sec)\n", - "Computing ML distances based on estimated model parameters...\n", - "Computing ML distances took 0.001753 sec (of wall-clock time) 0.005914 sec(of CPU time)\n", - "Computing RapidNJ tree took 0.009550 sec (of wall-clock time) 0.009811 sec (of CPU time)\n", - "Log-likelihood of RapidNJ tree: -41382.805\n", - "--------------------------------------------------------------------\n", - "| INITIALIZING CANDIDATE TREE SET |\n", - "--------------------------------------------------------------------\n", - "Generating 98 parsimony trees... 0.079 second\n", - "Computing log-likelihood of 98 initial trees ... 0.073 seconds\n", - "Current best score: -41316.721\n", - "\n", - "Do NNI search on 20 best initial trees\n", - "Estimate model parameters (epsilon = 0.100)\n", - "BETTER TREE FOUND at iteration 1: -41316.682\n", - "Iteration 10 / LogL: -41316.682 / Time: 0h:0m:0s\n", - "Iteration 20 / LogL: -41316.682 / Time: 0h:0m:0s\n", - "Finish initializing candidate tree set (1)\n", - "Current best tree score: -41316.682 / CPU time: 0.246\n", - "Number of iterations: 20\n", - "--------------------------------------------------------------------\n", - "| OPTIMIZING CANDIDATE TREE SET |\n", - "--------------------------------------------------------------------\n", - "UPDATE BEST LOG-LIKELIHOOD: -41316.682\n", - "Iteration 30 / LogL: -41316.732 / Time: 0h:0m:0s (0h:0m:0s left)\n", - "Iteration 40 / LogL: -41316.719 / Time: 0h:0m:0s (0h:0m:0s left)\n", - "Iteration 50 / LogL: -41316.716 / Time: 0h:0m:0s (0h:0m:0s left)\n", - "Iteration 60 / LogL: -41341.534 / Time: 0h:0m:0s (0h:0m:0s left)\n", - "Iteration 70 / LogL: -41316.803 / Time: 0h:0m:0s (0h:0m:0s left)\n", - "UPDATE BEST 
LOG-LIKELIHOOD: -41316.682\n", - "Iteration 80 / LogL: -41327.750 / Time: 0h:0m:0s (0h:0m:0s left)\n", - "Iteration 90 / LogL: -41316.734 / Time: 0h:0m:1s (0h:0m:0s left)\n", - "UPDATE BEST LOG-LIKELIHOOD: -41316.682\n", - "Iteration 100 / LogL: -41316.803 / Time: 0h:0m:1s (0h:0m:0s left)\n", - "TREE SEARCH COMPLETED AFTER 102 ITERATIONS / Time: 0h:0m:1s\n", - "\n", - "--------------------------------------------------------------------\n", - "| FINALIZING TREE SEARCH |\n", - "--------------------------------------------------------------------\n", - "Performs final model parameters optimization\n", - "Estimate model parameters (epsilon = 0.010)\n", - "1. Initial log-likelihood: -41316.682\n", - "Optimal log-likelihood: -41316.677\n", - "Rate parameters: A-C: 1.00000 A-G: 4.46795 A-T: 1.00000 C-G: 1.00000 C-T: 4.46795 G-T: 1.00000\n", - "Base frequencies: A: 0.299 C: 0.183 G: 0.196 T: 0.322\n", - "Parameters optimization took 1 rounds (0.002 sec)\n", - "BEST SCORE FOUND : -41316.677\n", - "Total tree length: 0.005\n", - "\n", - "Total number of iterations: 102\n", - "CPU time used for tree search: 4.260 sec (0h:0m:4s)\n", - "Wall-clock time used for tree search: 1.142 sec (0h:0m:1s)\n", - "Total CPU time used: 4.401 sec (0h:0m:4s)\n", - "Total wall-clock time used: 1.190 sec (0h:0m:1s)\n", - "\n", - "Analysis results written to: \n", - " IQ-TREE report: sarscov2_tree.iqtree\n", - " Maximum-likelihood tree: sarscov2_tree.treefile\n", - " Likelihood distances: sarscov2_tree.mldist\n", - " Screen log file: sarscov2_tree.log\n", - "\n", - "Date and Time: Fri Dec 3 23:53:06 2021\n" - ] - } - ], + "outputs": [], "source": [ "#run iqtree with threads = $CPU variable, if you exclude the -m it will do a phylogenetic model search before tree search\n", - "!iqtree -s sequences.aln.fasta -nt $CPU -m HKY --prefix sarscov2_tree --redo-tree" + "! 
iqtree -s sequences.aln.fasta -nt $CPU -m HKY --prefix sarscov2_tree --redo-tree" ] }, { @@ -1244,7 +285,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": null, "id": "cef2ba18", "metadata": {}, "outputs": [], @@ -1255,23 +296,10 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "id": "842af165", "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
[rendered tree figure (HTML/SVG) omitted; its tip labels were the 18 sequence names listed in the composition table above]
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "#draw the tree\n", "rtre = tre.root(wildcard=\"OL\")\n", @@ -1288,19 +316,19 @@ }, { "cell_type": "markdown", - "id": "88457512", "metadata": {}, "source": [ - "And that is all! You now know how to run workflows in notebooks in Cloud Lab" + "## Conclusions\n", + "Here you learned how to use Azure ML Studio to conduct a basic phylogenetic analysis." ] }, { - "cell_type": "code", - "execution_count": null, - "id": "e417cb1a", + "cell_type": "markdown", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "## Clean Up\n", + "Make sure you stop your compute instance and, if desired, delete the resource group associated with this tutorial." + ] } ], "metadata": { From a25581ffd97dfa60f7fd44502335334f8fc5fe7e Mon Sep 17 00:00:00 2001 From: Kyle O'Connell Date: Thu, 7 Mar 2024 16:17:45 -0500 Subject: [PATCH 09/25] updated notebook formatting --- notebooks/SRADownload/SRA-Download.ipynb | 212 +-- .../SpleenSeg_Pretrained-4_27.ipynb | 1196 ++--------------- .../RNAseq_pipeline.ipynb | 495 +++---- 3 files changed, 382 insertions(+), 1521 deletions(-) diff --git a/notebooks/SRADownload/SRA-Download.ipynb b/notebooks/SRADownload/SRA-Download.ipynb index aad19bb..963e317 100644 --- a/notebooks/SRADownload/SRA-Download.ipynb +++ b/notebooks/SRADownload/SRA-Download.ipynb @@ -18,12 +18,30 @@ "DNA sequence data are typically deposited into the NCBI Sequence Read Archive, and can be accessed through the SRA website, or via a collection of command line tools called SRA Toolkit. Individual sequence entries are assigned an Accession ID, which can be used to find and download a particular file. For example, if you go to the [SRA database](https://www.ncbi.nlm.nih.gov/sra) in a browser window, and search for `SRX15695630`, you should see an entry for _C. elegans_. Alternatively, you can search the SRA metadata using Amazon Athena and generate a list of accession numbers.
Here we are going to generate a list of accessions using Athena, use tools from the SRA Toolkit to download a few fastq files, then copy those fastq files to a cloud bucket. We really only scratch the surface of how to search Athena using SQL. If you want more examples, you can also try the notebooks from [this SRA GitHub repo](https://github.com/ncbi/ASHG-Workshop-2021). " ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Learning objectives\n", + "+ Learn how to set up an Athena Database\n", + "+ Learn how to use AWS Glue to scrape the SRA metadata\n", + "+ Query Athena to find target Accession numbers\n", + "+ Use SRA tools to download genomic sequence data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get started" + ] + }, { "cell_type": "markdown", "id": "39f62f42", "metadata": {}, "source": [ - "### 1) Set up your Athena Database\n", + "### Set up your Athena Database\n", "You need to set up your Athena database in the Athena console before you start this notebook. Follow our [guide](https://github.com/STRIDES/NIHCloudLabAWS/blob/main/docs/create_athena_database.md) to walk you through it." 
] }, @@ -32,7 +50,7 @@ "id": "7aed7098", "metadata": {}, "source": [ - "### 2) Install Dependencies" + "### Install packages\n" ] }, { @@ -99,7 +117,7 @@ "id": "ddc46609", "metadata": {}, "source": [ - "### 3) Setup Directory Structure and Create a Staging Bucket" + "### Setup Directory Structure and Create a Staging Bucket" ] }, { @@ -114,18 +132,10 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "id": "827f2447", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/home/ec2-user/SageMaker/NIHCloudLabAWS/tutorials/notebooks/SRADownload/data\n" - ] - } - ], + "outputs": [], "source": [ "cd data/" ] @@ -158,7 +168,7 @@ "id": "086a50c1", "metadata": {}, "source": [ - "### 4) Create Accession List using Athena" + "### Create Accession List using Athena" ] }, { @@ -327,7 +337,7 @@ "id": "01437b57", "metadata": {}, "source": [ - "### 5) Download FASTQ files with fasterq dump" + "### Download FASTQ files with fasterq dump" ] }, { @@ -340,18 +350,10 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "id": "4764f355", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/home/ec2-user/SageMaker/NIHCloudLabAWS/tutorials/notebooks/SRADownload/data/fasterqdump\n" - ] - } - ], + "outputs": [], "source": [ "cd fasterqdump/" ] @@ -366,30 +368,10 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "id": "80c2e3b4", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "spots read : 2,054,166\n", - "reads read : 4,108,332\n", - "reads written : 4,108,332\n", - "spots read : 25,734,849\n", - "reads read : 51,469,698\n", - "reads written : 25,734,849\n", - "reads 0-length : 25,734,849\n", - "spots read : 18,624,005\n", - "reads read : 37,248,010\n", - "reads written : 18,624,005\n", - "reads 0-length : 18,624,005\n", - "CPU times: user 6.18 s, sys: 1.26 s, total: 
7.44 s\n", - "Wall time: 6min 36s\n" - ] - } - ], + "outputs": [], "source": [ "%%time\n", "!for x in `cat ../list_of_accessionIDS.txt`; do fasterq-dump -f -O raw_fastq -e 8 -m 4G $x ; done" @@ -408,7 +390,7 @@ "id": "55bd52cd", "metadata": {}, "source": [ - "### 6) Download FASTQ files with prefetch + fasterq dump" + "### Download FASTQ files with prefetch + fasterq dump" ] }, { @@ -421,57 +403,20 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "id": "ddefec2d", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/home/ec2-user/SageMaker/NIHCloudLabAWS/tutorials/notebooks/SRADownload/data/prefetch_fasterqdump\n" - ] - } - ], + "outputs": [], "source": [ "cd ../prefetch_fasterqdump" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "id": "935f6ca2", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "2022-08-30T15:45:12 prefetch.2.11.0: 1) Downloading 'SRR3617061'...\n", - "2022-08-30T15:45:12 prefetch.2.11.0: Downloading via HTTPS...\n", - "2022-08-30T15:45:16 prefetch.2.11.0: HTTPS download succeed\n", - "2022-08-30T15:45:17 prefetch.2.11.0: 'SRR3617061' is valid\n", - "2022-08-30T15:45:17 prefetch.2.11.0: 1) 'SRR3617061' was downloaded successfully\n", - "\n", - "2022-08-30T15:45:17 prefetch.2.11.0: 2) Downloading 'SRR8435254'...\n", - "2022-08-30T15:45:17 prefetch.2.11.0: Downloading via HTTPS...\n", - "2022-08-30T15:45:23 prefetch.2.11.0: HTTPS download succeed\n", - "2022-08-30T15:45:24 prefetch.2.11.0: 'SRR8435254' is valid\n", - "2022-08-30T15:45:24 prefetch.2.11.0: 2) 'SRR8435254' was downloaded successfully\n", - "2022-08-30T15:45:24 prefetch.2.11.0: 'SRR8435254' has 0 dependencies\n", - "\n", - "2022-08-30T15:45:24 prefetch.2.11.0: 3) Downloading 'SRR8435252'...\n", - "2022-08-30T15:45:24 prefetch.2.11.0: Downloading via HTTPS...\n", - "2022-08-30T15:45:28 prefetch.2.11.0: HTTPS download 
succeed\n", - "2022-08-30T15:45:29 prefetch.2.11.0: 'SRR8435252' is valid\n", - "2022-08-30T15:45:29 prefetch.2.11.0: 3) 'SRR8435252' was downloaded successfully\n", - "2022-08-30T15:45:29 prefetch.2.11.0: 'SRR8435252' has 0 dependencies\n", - "CPU times: user 290 ms, sys: 37.5 ms, total: 327 ms\n", - "Wall time: 17 s\n" - ] - } - ], + "outputs": [], "source": [ "%%time\n", "!prefetch --option-file ../list_of_accessionIDS.txt -O raw_fastq -f yes" @@ -479,18 +424,10 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "id": "7eece75e", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[0m\u001b[01;34mSRR3617061\u001b[0m/ \u001b[01;34mSRR8435252\u001b[0m/ \u001b[01;34mSRR8435254\u001b[0m/\n" - ] - } - ], + "outputs": [], "source": [ "ls raw_fastq/" ] @@ -505,30 +442,10 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": null, "id": "1852a71a", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "spots read : 2,054,166\n", - "reads read : 4,108,332\n", - "reads written : 4,108,332\n", - "spots read : 25,734,849\n", - "reads read : 51,469,698\n", - "reads written : 25,734,849\n", - "reads 0-length : 25,734,849\n", - "spots read : 18,624,005\n", - "reads read : 37,248,010\n", - "reads written : 18,624,005\n", - "reads 0-length : 18,624,005\n", - "CPU times: user 1.49 s, sys: 308 ms, total: 1.8 s\n", - "Wall time: 1min 38s\n" - ] - } - ], + "outputs": [], "source": [ "%%time\n", "!for x in `cat ../list_of_accessionIDS.txt`; do fasterq-dump -f -O raw_fastq -e 8 -m 4G raw_fastq/$x; done" @@ -547,7 +464,7 @@ "id": "ea152fd7", "metadata": {}, "source": [ - "### Step 7) Copy Files to a Bucket" + "### Copy Files to a Bucket" ] }, { @@ -560,60 +477,45 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": null, "id": "ad73308f", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", -
"text": [ - "upload: raw_fastq/SRR3617061/SRR3617061.sra to s3://sra-data-athena/raw_fastq/SRR3617061/SRR3617061.sra\n", - "upload: raw_fastq/SRR8435252/SRR8435252.sra to s3://sra-data-athena/raw_fastq/SRR8435252/SRR8435252.sra\n", - "upload: raw_fastq/SRR3617061_2.fastq to s3://sra-data-athena/raw_fastq/SRR3617061_2.fastq\n", - "upload: raw_fastq/SRR3617061_1.fastq to s3://sra-data-athena/raw_fastq/SRR3617061_1.fastq\n", - "upload: raw_fastq/SRR8435254/SRR8435254.sra to s3://sra-data-athena/raw_fastq/SRR8435254/SRR8435254.sra\n", - "upload: raw_fastq/SRR8435252.fastq to s3://sra-data-athena/raw_fastq/SRR8435252.fastq\n", - "upload: raw_fastq/SRR8435254.fastq to s3://sra-data-athena/raw_fastq/SRR8435254.fastq\n" - ] - } - ], + "outputs": [], "source": [ "!aws s3 cp raw_fastq/*.fastq s3://sra-data-athena/raw_fastq --recursive" ] }, { "cell_type": "code", - "execution_count": 25, + "execution_count": null, "id": "072ebc9a", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " PRE SRR3617061/\n", - " PRE SRR8435252/\n", - " PRE SRR8435254/\n", - "2022-08-30 15:53:41 722868342 SRR3617061_1.fastq\n", - "2022-08-30 15:53:41 722868342 SRR3617061_2.fastq\n", - "2022-08-30 15:53:42 3903844648 SRR8435252.fastq\n", - "2022-08-30 15:53:56 5411343576 SRR8435254.fastq\n" - ] - } - ], + "outputs": [], "source": [ "!aws s3 ls s3://sra-data-athena/raw_fastq/" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusions\n", + "You learned here how to bring the SRA metadata into Athena and query Athena DB to find target accession numbers, then use SRA tools to download sequence data locally." + ] + }, { "cell_type": "markdown", "id": "a4026566", "metadata": {}, "source": [ - "### Step 8) Clean up\n", + "## Clean up\n", "Make sure you shut down this VM, or delete it if you don't plan to use it further.
You can also [delete the buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-bucket.html) if you don't want to pay for the data: `aws s3 rb s3://bucket-name --force`" ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] } ], "metadata": { diff --git a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb index 48b8141..e6b6117 100644 --- a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb +++ b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb @@ -5,9 +5,45 @@ "id": "1452463e", "metadata": {}, "source": [ - "## Spleen Model With NVIDIA Pretrain\n", - "- Uses Unet architecture\n", - "- Pretrained model at: https://ngc.nvidia.com/catalog/models/nvidia:med:clara_pt_spleen_ct_segmentation" + "# Spleen Model With NVIDIA Pretrain" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Overview\n", + "This notebook conducts image segmentation of spleen images using an NVIDIA pretrained model." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "We assume you have provisioned a compute environment in Azure ML Studio **with a GPU**! A T4 GPU will work fine."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Learning objectives\n", + "+ Learn how to use NVIDIA pre-trained models for image segmentation within Azure ML Studio" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get started" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install packages" ] }, { @@ -20,7 +56,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "id": "82db674f", "metadata": {}, "outputs": [], @@ -31,7 +67,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "id": "bb1228b3", "metadata": {}, "outputs": [], @@ -41,7 +77,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "id": "540e5d47", "metadata": {}, "outputs": [], @@ -65,42 +101,10 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "id": "07510582", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "MONAI version: 0.8.1\n", - "Numpy version: 1.21.1\n", - "Pytorch version: 1.9.0\n", - "MONAI flags: HAS_EXT = False, USE_COMPILED = False\n", - "MONAI rev id: 71ff399a3ea07aef667b23653620a290364095b1\n", - "\n", - "Optional dependencies:\n", - "Pytorch Ignite version: 0.4.8\n", - "Nibabel version: 3.2.1\n", - "scikit-image version: 0.18.2\n", - "Pillow version: 8.3.1\n", - "Tensorboard version: 2.5.0\n", - "gdown version: 3.13.0\n", - "TorchVision version: 0.10.0+cu111\n", - "tqdm version: 4.61.2\n", - "lmdb version: 1.2.1\n", - "psutil version: 5.8.0\n", - "pandas version: 1.3.0\n", - "einops version: 0.3.0\n", - "transformers version: 4.18.0\n", - "mlflow version: 1.25.1\n", - "\n", - "For details about installing the optional dependencies, please visit:\n", - " https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies\n", - "\n" - ] - } - ], + "outputs": [], "source": [ "import os\n", "import tempfile\n", @@ 
-157,7 +161,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "id": "0be7401d", "metadata": {}, "outputs": [], @@ -175,18 +179,10 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "id": "311c3282", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "monai_data/\n" - ] - } - ], + "outputs": [], "source": [ "directory = \"monai_data/\"\n", "root_dir = tempfile.mkdtemp() if directory is None else directory\n", @@ -203,20 +199,10 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": null, "id": "da7cfede", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2022-04-27 14:49:41,401 - INFO - Verified 'Task09_Spleen.tar', md5: 410d4a301da4e5b2f6f86ec3ddba524e.\n", - "2022-04-27 14:49:41,402 - INFO - File exists: monai_data/Task09_Spleen.tar, skipped downloading.\n", - "2022-04-27 14:49:41,403 - INFO - Non-empty folder exists in monai_data/Task09_Spleen, skipped extracting.\n" - ] - } - ], + "outputs": [], "source": [ "resource = \"https://msd-for-monai.s3-us-west-2.amazonaws.com/Task09_Spleen.tar\"\n", "md5 = \"410d4a301da4e5b2f6f86ec3ddba524e\"\n", @@ -236,7 +222,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "id": "2515b177", "metadata": {}, "outputs": [], @@ -262,7 +248,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "id": "2357d35d", "metadata": {}, "outputs": [], @@ -328,38 +314,10 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "id": "ada5757a", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'image': 'monai_data/Task09_Spleen/imagesTr/spleen_56.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_56.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_59.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_59.nii.gz'},\n", - " 
{'image': 'monai_data/Task09_Spleen/imagesTr/spleen_6.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_6.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_60.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_60.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_61.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_61.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_62.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_62.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_63.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_63.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_8.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_8.nii.gz'},\n", - " {'image': 'monai_data/Task09_Spleen/imagesTr/spleen_9.nii.gz',\n", - " 'label': 'monai_data/Task09_Spleen/labelsTr/spleen_9.nii.gz'}]" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "val_files" ] @@ -374,30 +332,10 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "id": "689eea4e", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "image shape: torch.Size([239, 239, 113]), label shape: torch.Size([239, 239, 113])\n" - ] - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "check_ds = Dataset(data=val_files, transform=val_transforms)\n", "check_loader = DataLoader(check_ds, batch_size=1)\n", @@ -427,74 +365,10 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "id": "fe3285d0", "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|██████████| 32/32 [00:00<00:00, 57113.93it/s]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Accessing lmdb file: /home/jupyter/covid19det-kaggle/kaggle/MonaiTesting/monai_data/monai_cache.lmdb.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|██████████| 32/32 [00:00<00:00, 47679.48it/s]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{'map_addr': 0, 'map_size': 1099511627776, 'last_pgno': 941102, 'last_txnid': 100, 'max_readers': 126, 'num_readers': 0, 'size': 32, 'filename': '/home/jupyter/covid19det-kaggle/kaggle/MonaiTesting/monai_data/monai_cache.lmdb'}\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|██████████| 9/9 [00:00<00:00, 10999.05it/s]\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Accessing lmdb file: /home/jupyter/covid19det-kaggle/kaggle/MonaiTesting/monai_data/monai_cache.lmdb.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|██████████| 9/9 [00:00<00:00, 17739.07it/s]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{'map_addr': 0, 'map_size': 1099511627776, 'last_pgno': 941102, 'last_txnid': 100, 'max_readers': 126, 'num_readers': 0, 'size': 9, 'filename': '/home/jupyter/covid19det-kaggle/kaggle/MonaiTesting/monai_data/monai_cache.lmdb'}\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\n" - ] - } - ], + "outputs": [], "source": [ 
"train_ds = LMDBDataset(data=train_files, transform=train_transforms, cache_dir=root_dir)\n", "# initialize cache and print meta information\n", @@ -513,18 +387,10 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "id": "455cbcdc", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{'map_addr': 0, 'map_size': 1099511627776, 'last_pgno': 941102, 'last_txnid': 100, 'max_readers': 126, 'num_readers': 0, 'size': 32, 'filename': '/home/jupyter/covid19det-kaggle/kaggle/MonaiTesting/monai_data/monai_cache.lmdb'}\n" - ] - } - ], + "outputs": [], "source": [ "print(train_ds.info())" ] @@ -539,7 +405,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "id": "8539fb7d", "metadata": {}, "outputs": [], @@ -558,50 +424,20 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "id": "de7fb262", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'clara_pt_spleen_ct_segmentation'" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "mmar['name']" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": null, "id": "bf96f9f9", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "using a pretrained model.\n", - "2022-04-27 14:49:45,704 - INFO - Expected md5 is None, skip md5 check for file monai_data/clara_pt_spleen_ct_segmentation_2.zip.\n", - "2022-04-27 14:49:45,705 - INFO - File exists: monai_data/clara_pt_spleen_ct_segmentation_2.zip, skipped downloading.\n", - "2022-04-27 14:49:45,706 - INFO - Non-empty folder exists in monai_data/clara_pt_spleen_ct_segmentation, skipped extracting.\n", - "2022-04-27 14:49:45,707 - INFO - \n", - "*** \"clara_pt_spleen_ct_segmentation\" available at monai_data/clara_pt_spleen_ct_segmentation.\n", - "2022-04-27 14:49:49,353 - INFO - *** Model: \n", - "2022-04-27 
14:49:49,400 - INFO - *** Model params: {'dimensions': 3, 'in_channels': 1, 'out_channels': 2, 'channels': [16, 32, 64, 128, 256], 'strides': [2, 2, 2, 2], 'num_res_units': 2, 'norm': 'batch'}\n", - "2022-04-27 14:49:49,411 - INFO - \n", - "---\n", - "2022-04-27 14:49:49,412 - INFO - For more information, please visit https://ngc.nvidia.com/catalog/models/nvidia:med:clara_pt_spleen_ct_segmentation\n", - "\n" - ] - } - ], + "outputs": [], "source": [ "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\") #torch.device(\"cpu\")\n", "if PRETRAINED:\n", @@ -646,32 +482,10 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": null, "id": "4be7eb8f", "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "100%|██████████| 1/1 [00:00<00:00, 4639.72it/s]" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Accessing lmdb file: /home/jupyter/covid19det-kaggle/kaggle/MonaiTesting/monai_data/monai_cache.lmdb.\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\n" - ] - } - ], + "outputs": [], "source": [ "test_file = data_dicts[20:21]\n", "test_ds = LMDBDataset(data=test_file, transform=None, cache_dir=root_dir)" @@ -687,7 +501,7 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": null, "id": "16fd4e94", "metadata": {}, "outputs": [], @@ -712,33 +526,10 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": null, "id": "9782ec96", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fig = plt.figure(frameon=False, figsize=(7,7))\n", "plt.title('Actual Spleen')\n", @@ -747,33 +538,10 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": null, "id": "76cd38e6", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 20, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fig = plt.figure(frameon=False, figsize=(7,7))\n", "plt.title('Pretrained CalculatedSpleen')\n", @@ -782,33 +550,10 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": null, "id": "65c68242", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAVcAAAGrCAYAAAB0YdR6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAg+klEQVR4nO3de5RedX3v8fc3k8nkMjkJuRByaxIgcEykRTsHWhGhB4U0XsDlpaHHNrRatIqWgi0YDzZacgCpyupSoHBQqCCQHlFja6tUkYtdctMYLuGSkEhCYkICBEKAZDK/88feE54kM5mZzPzmeZ6Z92utWbOf37599549n/nty/NMpJSQJPWtIdUuQJIGIsNVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXA9SRFwdERdVvP7LiNgUEdsjYnxEnBART5avz6hiqapjEXF9RFzcT+tKEXFkf6yrp+uMiJMjYn1/1NRXDNcORMTaiHglIl6KiBci4r8i4mMRsWd/pZQ+llL6+3L6RuDLwKkppeaU0lbgC8BXy9ffrcqGZBIRiyNiV/mHY3tErIyI9/Vg/rUR8facNfaFKDwVEY/2YJ7FEXFjzrqqISJ+Wgbh7+zT/t2y/eTqVFa7DNfOvTulNBqYAVwKXABc18m0k4DhwCMVbTP2ed1tETH0YObrZ7eWfziagXOBGyNiUpVr6mtvAw4FDo+I/1HtYmrAE8Cftr+IiPHA7wHPVq2iGma4diGltC2ltAz4I2BhRLwRXj9di4ijgMfLyV+IiJ9ExGrgcOD7Zc+uKSLGRMR1EbExIp4p520ol3VWRPwsIr4SEc8Bi8t5/iEini4vN1wdESPK6U+OiPURcX5EbC6X+WftNUfEiIj4UkT8OiK2RcQ9FfP+XtkTfyEiflXZ4yjreKrssa+JiP/VzX30Q+Al4IiKZb0rIpZX9Px/u2z/JvBbFfvmbyPihog4vxw/tewJfbx8fWREPBcRcaDlluOmRMS3I+LZsv5PVYxbHBFLI+Kfy+17JCJauti0hcD3gB+Uw3tExNyIuL2sbVNELIqIecAi4I/KbftVOe1ePfV9e7cR8S8R8ZvyZ3VXRMztzn6PiCPK421rRGyJiJsiYmzF+LUR8emIWFEu+9aIGF4x/m/KY2dDRPx5N1Z5U7ltDeXrM4HvADsrltkUEVeUy9xQDjd1Z50HOubrUkrJr32+gLXA2ztofxr4y3L4euDicngmkIChnS0D+C7wT8Aoit7QfcBHy3FnAa3AJ4GhwAjgCmAZMA4YDXwfuKSc/uRy+i8AjcB8YAdwSDn+a8BPgalAA/AWoKl8vbWcfgjwjvL1xLKuF4Gjy2VMBuZ2sn8WAzeWwwG8E3gBGFu2vRnYDBxfrn9huT+aOtk3fw58vxz+Y2A1Rc+4fdz3ulpuuT0PAp8DhlH8cXsKOK2i5lfLbW8ALgF+foBjYGS5P+YD7wO2AMP
KcaOBjcD5FGcso4Hj9903BzgW9pqm3MbR5XZcASyvGHc95XHWQY1Hlj/DpvJneBdwxT7rvQ+YQnEcrQQ+Vo6bB2wC3lj+7L9FcQwf2cm6fgp8BPgR8Idl233A7wPrgZPLti8AP6c4xicC/wX8fXfWSdfH/PpqZ0OPcqTaBdTi176/DBXtPwc+Ww7vOejpIlwpLhu8BoyoGH8mcEc5fBbwdMW4AF4Gjqho+31gTTl8MvDKPuvbTHGKNqQc9zsd1H8B8M192n5IEVKjKALyfZV1drJ/FlP0Vl6gCPXdwN9WjL+q/Reqou1x4KSO9i9Fj/eFsvargY+2/yIBNwDndbVcisB9ep9xnwG+UVHzf1aMmwO8coBt/BDF6e5QivB6AXhvxc/ulwfYNz0K132mHVseS2P2Pc66cdyeUVlXud4PVbz+InB1Ofx14NKKcUfRvXD9EHAzcDTwRDmuMlxXA/Mr5jsNWNvVOuneMV9X4eplgZ6ZCjx3EPPNoOhhbixPZ1+g6MUeWjHNuorhiRQ9pwcrpv+Psr3d1pRSa8XrHUAzMIGiN7W6kzo+0L7McrlvBSanlF6muPTxsbLOf4uI/36AbVqaUhqbUhpJEY5/GhEfrVjP+fusZzpFD2o/KaXVwHbgWOBE4F+BDRFxNEVw3tmN5c4ApuwzbhHFH7Z2v9lnfw2Pzq9vLyy3sTWl9BpwG69fGphOx/u3xyKiISIujYjVEfEiRSBC8XPsat5DI+KWKC4zvQjc2MF8+25zczk8hb2PuV93s+TbgP9JcZb1zQ7GT9lnWb/m9Z/7gdbZnWO+rtTDjZOaEMUNjanAPQcx+zqKnuuEfQKxUuXHk22h6H3OTSk908N1baE4/T0C+FUHdXwzpfQXHRZQXDv9YXmd62LgWoqwO6CU0tqI+Hfg3RR/NNYBS1JKSzqbpYO2O4H3U5x6PxMRd1LcPDkEWF5Rf4fLjYj2Xs7srurtSkRMowiQ4+L1pyBGUoTxhLKOMzuZvaNte7mcv91hFcN/DJwOvJ0iWMcAz1P05LpySbm+304pbY3ikb+vdmM+KC5rTK94/VvdmSmltKP8Wf8lFdfYK2xg75u5v1W2dbXO3hzzNcmeaxci4r9FxLuAWyhO5R7q6TJSShsprlV9qVzekPJmxEmdTN9GEWxfiYhDyzqmRsRp3VhXG8Xp15fLGzwNEfH75U2FG4F3R8RpZfvwKG6OTYuISRHxnogYRfGHYDvF6X6XyjCax+u/UNcCH4uI46MwKiLeGRGjy/GbKK6JVroTOIfiuiEUp6GfBO5JKbXXcaDl3ge8GBEXRHFDryEi3hgHd5f/TyjujB9N0Zs+luIUdj1FqP4rcFhEnFvehBkdEcdXbNvMqHhsj+KPw4KIaIziJtr7K8aNptjfWykC+P/0oM7RFD+nFyJiKvA3PZh3KXBWRMyJiJHA3/Vg3kUUl3jWdjDuZuB/R8TE8g/R5yiOuwOuszfHfK0yXDv3/Yh4iaKX8lmK51j/7MCzHNCfUtxoeZSiZ/L/KG4adeYCYBXw8/KU7z8pftm749PAQ8D9FJcxLgOGpJTWUfSSFlFcT1xH8Qs5pPw6n6KX8RzF6fjHD7CO9jvi28v1/Az4PEBK6QHgLyh6Uc+X23FWxbyXUPwCvhARny7b7qQIi/ZwvYcibNpfH3C5ZQC/myII11D0hP4vRU+wpxYCV6aUflP5RXE9eGFK6SWKG0nvpjjtfhL4g3Lefym/b42IX5TDF1H08p4v99G3Ktb1zxSnx89QHBs/70Gdn6e4ybcN+DeKU/ZuSSn9O8UNpJ9Q7Mef9GDeDSmlzs7gLgYeAFZQHIO/KNu6s87eHPM1J8qLxZKkPmTPVZIyMFwlKYNs4RoR8yLi8YhYFREX5lqPJNWiLNdco3h73BMUF/3XU9zwODOl1O0PwJCkepbrOdfjgFUppacAIuIWirvUHYZrc3NzGj9+fKZSJCmPrVu3sn379g6fSc4VrlPZ+50Y6ynenrhHRJw
NnA0wbtw4LrzQKweS6sull17a6bhc11w7SvK9rj+klK5JKbWklFqam5s7mFyS6leucF3P3m9zm8brb4GTpAEvV7jeD8yOiFkRMQxYQPFRYpI0KGS55ppSao2Icyg+zq4B+HpK6aA+lV+S6lG2T8VKKf2A4hPcJWnQ8R1akpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGQytdgFST23ZsoWvfvWrALzrXe+ipaWlyhVJ+zNcVdPWrFnDunXrOPHEE4kIAIYPH86JJ54IwMSJE6tZntQpw1VV19bWxubNm/drHz9+PFu3bmX58uV7whSgubmZU045JUstzz77LCNGjKC5uTnL8jV4GK7qVyml/dp27tzJkiVL2L17917tixYtoqWlpV9P+6+88kpOOOEE3v72t/fbOjUwGa7qV2vWrOHaa6/dq62xsZHPf/7zNDQ07NVejd7jeeedR2NjY7+vVwOP4apsVqxYwRNPPLFX27PPPsu2bdsAOPnkk5k2bRoNDQ2MHTuWIUOq//DK6NGjWblyJRs2bMh26UGDg+GqPpdSYt26ddx///08+OCDHU4zY8YMWlpaOPzww/u5uq4988wz/OxnPzNc1SuGq/pUSomUElddddWeHiqw504/wNChQzn33HNpamqqRondUlmvdDB6Fa4RsRZ4CdgNtKaUWiJiHHArMBNYC3wwpfR878pUvdi8eTNf/OIXeeWVV/a0RQRf+MIXGDFixJ62YcOGVaO8bjnppJN461vfWu0yVOf6ouf6BymlLRWvLwR+nFK6NCIuLF9f0AfrUR1IKe0J1re85S3MnTsXgDFjxjB0aH2cKDU2NnpTS72W42g/HTi5HL4B+CmG64C3evVqdu3axfPPv36SMmvWLN70pjdVsSqpenobrgn4UUQk4J9SStcAk1JKGwFSShsj4tDeFqnalVKitbWVG264gS1bihOY9h5qLdz9l6qlt+F6QkppQxmgt0fEY92dMSLOBs4GGDduXC/LULU8//zzLF68mNbWVgAmTZrEokWLAPZ7blUaTHoVrimlDeX3zRHxHeA4YFNETC57rZOB/d/XWMxzDXANwIwZM/Z/247qRmtrK2eddRZjxoxh2LBhXq+U6MVHDkbEqIgY3T4MnAo8DCwDFpaTLQS+19siVZs2bdrEypUrAZg5cyZHH300s2bNqnJVUm3oTc91EvCd8nnAocC3Ukr/ERH3A0sj4sPA08AHel+mas2rr77KY489xrJlyxgxYoTPhUr7OOhwTSk9BfxOB+1bAd/aMsBdccUVzJkzh8svvxzwoXtpX/Xx4KFqxo4dO/ja177G/PnzmTZtmk8ESJ3wN0M90tbWxpo1axg3btyAeMpjxYoVPPXUU9UuQwOQ4aoeiQjGjh1b949ZtbW1sW3bNn7yk5+watWqapejAcjLAuqRkSNHsmTJkmqX0Ws7duxg0aJFXHDBBUyfPr3a5WgAsueqbrnpppv40Y9+RETs+apnI0eO5KKLLmLy5Ml1vy2qTfZcdUCtra3cddddTJw4kalTp1a7nD4zZMgQDjvssGqXoQHMnqsOqLW1ldtuu40jjjhizydc1ZtXX32VZ599tsP/3yXlYrhqwFu1ahVLliyhra2t2qVoEPGygA6oqamJiy++uK7/1fTRRx/N4sWLfSZX/cqjTR269957uf/++/c8elUvH3TdLiI49dRTmTp1Ko2NjYwdO9YbV+pXhqv2klJi/fr13H333axYsaLa5Ry0IUO
GMGHCBEaNGlXtUjRI1Vd3RFm1f/D15Zdfzq5du2hpaal2SQdt9+7dfOtb36p2GRrEDFftsXr1aq6++mp27drFRz7yEebMmVPtkqS65WUB7bF792527NgBwPDhwxk+fHiVK+qZ+fPnc+SRR1a7DAmw56oOzJ49m5EjR1a7jG4bMWIEM2fOZNSoUTX9L7s1uBiuAopea0qJ4cOH8/GPf5ympqZql9QtEcHUqVN529vexo033sjLL79c7ZIkwHBVaenSpTz//PNccsklddX7O/XUUxk/fjzf+MY32LlzZ7XLkfbwmusg19bWxvXXX8+KFSvYvXs3TU1NdfU86JAhQxgyZIjBqppjuIqHHnqIMWPG1N3TAXPnzqW1tZVNmzZVuxRpP4brINbW1sarr74KwDve8Q5OOaW+/vXZ8ccfz44dO7j99turXYq0H6+5DmLr1q3j8ssv9wNNpAwM10Hq7rvv5q677iKlxCc/+UmmTZtW7ZK6bdy4ccybN4/GxsZqlyJ1ynAdpDZt2sS2bds46aSTOPzww+vm0Sso3uAwZswYNm7cyNatW6tdjtQhw3UQGz9+PB/84AerXUaPDBs2jOHDh7Nz507uuOMOtm3bVu2SpA4ZrqorJ554ItOmTeO6666rdinSAfm0gCRlYM91kHrDG97AlClTql1Gjxx//PEAPPXUU1WuROqa4TpI1dM/G2xoaGDcuHHMmTOHX//619x9993VLknqkuGqmjdmzBje//73V7sMqUcMV9W04447bk8v+4477uDpp5+uckVS93hDSzVt2LBhDB06lEcffZSnn356z4d5S7XOcB2EpkyZUhf/KnvSpEk0NDTw4osvcueddxqsqiuG6yA0b948jjnmmGqX0aVTTz2V4cOHc+utt1a7FKnHvOaqmjNq1CgWLFjAPffcw9q1a6tdjnRQ7Lmq5uzcuZOHH36YZ555htdee63a5UgHxZ6ras6uXbu49957q12G1Cv2XCUpA8NVkjIwXFVTnnzySW666SZSStUuReoVw3UQ2rJlCy+++GKvlrF69Wp+85vf9FFFr0spsXv37j5frtTfvKE1CC1btqxb07W1tbFr1y4AGhsbGTLk9b/Ft912G3PnzmX+/Pl9WttRRx3FUUcd1afLlKrBcFWnHnnkEa655hoA/vqv/5rDDz98z7jzzjuPiKhWaVLNM1wHsU2bNnHzzTd3OO6MM85g5syZfOpTnwLgsMMO22t8Q0ND9vqkema4DmINDQ1MmDChw3GNjY2MHj2a0aNH92iZO3bs4NFHH+WYY46pq396KPU1w3UQmzBhAh/60If6dJnbt2/n1ltvZdq0aYwfP95/f61By6cF1KcmTpzIZZddxj/+4z9y3333VbscqWoMV/WpiCAiOOecc3juuee49NJLueyyy3jllVeqXZrUr7wsoD4XEUyZMoWtW7fS1tYGwJAhQ3j88cfZuXNnXXzcodRbhquyOeaYY/YK0scee4zNmzczffp0xowZ46NcGtC8LKB+8573vIeTTjqJiy66iJ07d1a7HCkrw1X9JiKYMWMGn/vc5xg2bBjvfOc7+/wdXlKtMFzVr5qampg4cSIRwYYNG9iwYUOX87zyyiv88Ic/5OWXX+6HCqW+4TVXVc0vf/nL/dpGjhzJ2LFj92rbunUr9957Ly0tLYwaNaqfqpN6x3BVTXnjG9/I7/7u7+7V9uqrr/rvXlR3DFfVlOXLl/PEE0/s1db+OJdUTwxX1ZSdO3f6JIEGBG9oSVIGXYZrRHw9IjZHxMMVbeMi4vaIeLL8fkjFuM9ExKqIeDwiTstVuCTVsu70XK8H5u3TdiHw45TSbODH5WsiYg6wAJhbznNlRPjBn5IGnS7DNaV0F/DcPs2nAzeUwzcAZ1S035JSei2ltAZYBRzXN6VKUv042Guuk1JKGwHK74eW7VOBdRXTrS/b9hMRZ0fEAxHxwPbt2w+yDEmqTX19Q6ujT+Lo8H8kp5SuSSm1pJRampub+7g
MSaqugw3XTRExGaD8vrlsXw9Mr5huGtD1+xslaYA52HBdBiwshxcC36toXxARTRExC5gN+HH0kgadLt9EEBE3AycDEyJiPfB3wKXA0oj4MPA08AGAlNIjEbEUeBRoBT6RUtqdqXZJqlldhmtK6cxORp3SyfRLgCW9KUqS6p3v0JKkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDAxXScrAcJWkDLoM14j4ekRsjoiHK9oWR8QzEbG8/JpfMe4zEbEqIh6PiNNyFS5Jtaw7PdfrgXkdtH8lpXRs+fUDgIiYAywA5pbzXBkRDX1VrCTViy7DNaV0F/BcN5d3OnBLSum1lNIaYBVwXC/qk6S61JtrrudExIryssEhZdtUYF3FNOvLtv1ExNkR8UBEPLB9+/ZelCFJtedgw/Uq4AjgWGAj8KWyPTqYNnW0gJTSNSmllpRSS3Nz80GWIUm16aDCNaW0KaW0O6XUBlzL66f+64HpFZNOAzb0rkRJqj8HFa4RMbni5XuB9icJlgELIqIpImYBs4H7eleiJNWfoV1NEBE3AycDEyJiPfB3wMkRcSzFKf9a4KMAKaVHImIp8CjQCnwipbQ7S+WSVMO6DNeU0pkdNF93gOmXAEt6U5Qk1TvfoSVJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpSB4SpJGRiukpRBl+EaEdMj4o6IWBkRj0TEX5Xt4yLi9oh4svx+SMU8n4mIVRHxeESclnMDJKkWdafn2gqcn1J6A/B7wCciYg5wIfDjlNJs4Mfla8pxC4C5wDzgyohoyFG8JNWqLsM1pbQxpfSLcvglYCUwFTgduKGc7AbgjHL4dOCWlNJrKaU1wCrguD6uW5JqWo+uuUbETOBNwL3ApJTSRigCGDi0nGwqsK5itvVl277LOjsiHoiIB7Zv334QpUtS7ep2uEZEM/Bt4NyU0osHmrSDtrRfQ0rXpJRaUkotzc3N3S1DkupCt8I1IhopgvWmlNJtZfOmiJhcjp8MbC7b1wPTK2afBmzom3IlqT5052mBAK4DVqaUvlwxahmwsBxeCHyvon1BRDRFxCxgNnBf35UsSbVvaDemOQH4E+ChiFheti0CLgWWRsSHgaeBDwCklB6JiKXAoxRPGnwipbS7rwuXpFrWZbimlO6h4+uoAKd0Ms8SYEkv6pKkuuY7tCQpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjI
wXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpA8NVkjIwXCUpgy7DNSKmR8QdEbEyIh6JiL8q2xdHxDMRsbz8ml8xz2ciYlVEPB4Rp+XcAEmqRUO7MU0rcH5K6RcRMRp4MCJuL8d9JaX0D5UTR8QcYAEwF5gC/GdEHJVS2t2XhUtSLeuy55pS2phS+kU5/BKwEph6gFlOB25JKb2WUloDrAKO64tiJale9Oiaa0TMBN4E3Fs2nRMRKyLi6xFxSNk2FVhXMdt6OgjjiDg7Ih6IiAe2b9/e88olqYZ1O1wjohn4NnBuSulF4CrgCOBYYCPwpfZJO5g97deQ0jUppZaUUktzc3NP65akmtatcI2IRopgvSmldBtASmlTSml3SqkNuJbXT/3XA9MrZp8GbOi7kiWp9nXnaYEArgNWppS+XNE+uWKy9wIPl8PLgAUR0RQRs4DZwH19V7Ik1b7uPC1wAvAnwEMRsbxsWwScGRHHUpzyrwU+CpBSeiQilgKPUjxp8AmfFJA02HQZrimle+j4OuoPDjDPEmBJL+qSpLrmO7QkKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyMFwlKQPDVZIyiJRStWsgIp4FXga2VLuWGjAB94P7oOB+qP19MCOlNLGjETURrgAR8UBKqaXadVSb+8F90M79UN/7wMsCkpSB4SpJGdRSuF5T7QJqhPvBfdDO/VDH+6BmrrlK0kBSSz1XSRowDFdJyqDq4RoR8yLi8YhYFREXVrue/hQRayPioYhYHhEPlG3jIuL2iHiy/H5ItevsaxHx9YjYHBEPV7R1ut0R8Zny+Hg8Ik6rTtV9q5N9sDginimPh+URMb9i3IDbBwARMT0i7oiIlRHxSET8Vdle/8dDSqlqX0ADsBo4HBgG/AqYU82a+nn71wIT9mn7InBhOXwhcFm168yw3W8D3gw83NV2A3PK46IJmFUeLw3V3oZM+2Ax8OkOph2Q+6DctsnAm8vh0cAT5fbW/fFQ7Z7rccCqlNJTKaWdwC3A6VWuqdpOB24oh28AzqheKXmklO4CntunubPtPh24JaX0WkppDbCK4ripa53sg84MyH0AkFLamFL6RTn8ErASmMoAOB6qHa5TgXUVr9eXbYNFAn4UEQ9GxNll26SU0kYoDjzg0KpV17862+7BdoycExEryssG7afCg2IfRMRM4E3AvQyA46Ha4RodtA2mZ8NOSCm9GfhD4BMR8bZqF1SDBtMxchVwBHAssBH4Utk+4PdBRDQD3wbOTSm9eKBJO2iryX1R7XBdD0yveD0N2FClWvpdSmlD+X0z8B2K05tNETEZoPy+uXoV9qvOtnvQHCMppU0ppd0ppTbgWl4/3R3Q+yAiGimC9aaU0m1lc90fD9UO1/uB2RExKyKGAQuAZVWuqV9ExKiIGN0+DJwKPEyx/QvLyRYC36tOhf2us+1eBiyIiKaImAXMBu6rQn3ZtYdJ6b0UxwMM4H0QEQFcB6xMKX25YlT9Hw/VvqMGzKe4Q7ga+Gy16+nH7T6c4q7nr4BH2rcdGA/8GHiy/D6u2rVm2PabKU57d1H0RD58oO0GPlseH48Df1jt+jPug28CDwErKEJk8kDeB+V2vZXitH4FsLz8mj8Qjgff/ipJGVT7soAkDUiGqyRlYLhKUgaGqyRlYLhKUgaGqyRlYLhKUgb/H9un52S84ot/AAAAAElFTkSuQmCC\n", - 
"text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fig = plt.figure(frameon=False, figsize=(7,7))\n", "plt.title('Differences Between Actual and Model')\n", @@ -837,7 +582,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": null, "id": "a8ad6aee", "metadata": {}, "outputs": [], @@ -848,526 +593,10 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": null, "id": "d91d340c", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "----------\n", - "epoch 1/25\n", - "1/16, train_loss: 0.8680\n", - "2/16, train_loss: 0.3699\n", - "3/16, train_loss: 0.3849\n", - "4/16, train_loss: 0.1306\n", - "5/16, train_loss: 0.2781\n", - "6/16, train_loss: 0.3628\n", - "7/16, train_loss: 0.3609\n", - "8/16, train_loss: 0.1828\n", - "9/16, train_loss: 0.1493\n", - "10/16, train_loss: 0.5063\n", - "11/16, train_loss: 0.2929\n", - "12/16, train_loss: 0.2826\n", - "13/16, train_loss: 0.2017\n", - "14/16, train_loss: 0.2591\n", - "15/16, train_loss: 0.2568\n", - "16/16, train_loss: 0.2385\n", - "epoch 1 average loss: 0.3203\n", - "----------\n", - "epoch 2/25\n", - "1/16, train_loss: 0.3457\n", - "2/16, train_loss: 0.2234\n", - "3/16, train_loss: 0.3443\n", - "4/16, train_loss: 0.0816\n", - "5/16, train_loss: 0.2259\n", - "6/16, train_loss: 0.1580\n", - "7/16, train_loss: 0.2593\n", - "8/16, train_loss: 0.1651\n", - "9/16, train_loss: 0.1124\n", - "10/16, train_loss: 0.4822\n", - "11/16, train_loss: 0.2900\n", - "12/16, train_loss: 0.2571\n", - "13/16, train_loss: 0.1799\n", - "14/16, train_loss: 0.1984\n", - "15/16, train_loss: 0.2286\n", - "16/16, train_loss: 0.2216\n", - "epoch 2 average loss: 0.2359\n", - "saved new best metric model\n", - "current epoch: 2 current mean dice: 0.8615\n", - "best mean dice: 0.8615 at epoch: 2\n", - "----------\n", - "epoch 3/25\n", - "1/16, train_loss: 0.3400\n", - "2/16, train_loss: 
0.2297\n", - "3/16, train_loss: 0.3453\n", - "4/16, train_loss: 0.0822\n", - "5/16, train_loss: 0.2285\n", - "6/16, train_loss: 0.1213\n", - "7/16, train_loss: 0.2370\n", - "8/16, train_loss: 0.1607\n", - "9/16, train_loss: 0.1065\n", - "10/16, train_loss: 0.4543\n", - "11/16, train_loss: 0.2848\n", - "12/16, train_loss: 0.2848\n", - "13/16, train_loss: 0.1763\n", - "14/16, train_loss: 0.1748\n", - "15/16, train_loss: 0.4361\n", - "16/16, train_loss: 0.2234\n", - "epoch 3 average loss: 0.2429\n", - "----------\n", - "epoch 4/25\n", - "1/16, train_loss: 0.3328\n", - "2/16, train_loss: 0.2447\n", - "3/16, train_loss: 0.3436\n", - "4/16, train_loss: 0.0723\n", - "5/16, train_loss: 0.2213\n", - "6/16, train_loss: 0.1676\n", - "7/16, train_loss: 0.2672\n", - "8/16, train_loss: 0.2121\n", - "9/16, train_loss: 0.1122\n", - "10/16, train_loss: 0.5265\n", - "11/16, train_loss: 0.2810\n", - "12/16, train_loss: 0.2688\n", - "13/16, train_loss: 0.1795\n", - "14/16, train_loss: 0.1853\n", - "15/16, train_loss: 0.2458\n", - "16/16, train_loss: 0.2314\n", - "epoch 4 average loss: 0.2433\n", - "saved new best metric model\n", - "current epoch: 4 current mean dice: 0.8744\n", - "best mean dice: 0.8744 at epoch: 4\n", - "----------\n", - "epoch 5/25\n", - "1/16, train_loss: 0.3378\n", - "2/16, train_loss: 0.2047\n", - "3/16, train_loss: 0.3350\n", - "4/16, train_loss: 0.0583\n", - "5/16, train_loss: 0.2161\n", - "6/16, train_loss: 0.1008\n", - "7/16, train_loss: 0.2325\n", - "8/16, train_loss: 0.1629\n", - "9/16, train_loss: 0.1037\n", - "10/16, train_loss: 0.4499\n", - "11/16, train_loss: 0.2763\n", - "12/16, train_loss: 0.2321\n", - "13/16, train_loss: 0.1702\n", - "14/16, train_loss: 0.1652\n", - "15/16, train_loss: 0.2206\n", - "16/16, train_loss: 0.2169\n", - "epoch 5 average loss: 0.2177\n", - "----------\n", - "epoch 6/25\n", - "1/16, train_loss: 0.3303\n", - "2/16, train_loss: 0.1888\n", - "3/16, train_loss: 0.3331\n", - "4/16, train_loss: 0.0535\n", - "5/16, train_loss: 
0.2149\n", - "6/16, train_loss: 0.0962\n", - "7/16, train_loss: 0.2267\n", - "8/16, train_loss: 0.1555\n", - "9/16, train_loss: 0.0995\n", - "10/16, train_loss: 0.4476\n", - "11/16, train_loss: 0.2751\n", - "12/16, train_loss: 0.2215\n", - "13/16, train_loss: 0.1644\n", - "14/16, train_loss: 0.1603\n", - "15/16, train_loss: 0.2159\n", - "16/16, train_loss: 0.2141\n", - "epoch 6 average loss: 0.2123\n", - "saved new best metric model\n", - "current epoch: 6 current mean dice: 0.8952\n", - "best mean dice: 0.8952 at epoch: 6\n", - "----------\n", - "epoch 7/25\n", - "1/16, train_loss: 0.3286\n", - "2/16, train_loss: 0.1815\n", - "3/16, train_loss: 0.3317\n", - "4/16, train_loss: 0.0487\n", - "5/16, train_loss: 0.2127\n", - "6/16, train_loss: 0.0926\n", - "7/16, train_loss: 0.2236\n", - "8/16, train_loss: 0.1536\n", - "9/16, train_loss: 0.0955\n", - "10/16, train_loss: 0.4468\n", - "11/16, train_loss: 0.2730\n", - "12/16, train_loss: 0.2171\n", - "13/16, train_loss: 0.1616\n", - "14/16, train_loss: 0.1565\n", - "15/16, train_loss: 0.2147\n", - "16/16, train_loss: 0.2123\n", - "epoch 7 average loss: 0.2094\n", - "----------\n", - "epoch 8/25\n", - "1/16, train_loss: 0.3276\n", - "2/16, train_loss: 0.1800\n", - "3/16, train_loss: 0.3311\n", - "4/16, train_loss: 0.0459\n", - "5/16, train_loss: 0.2114\n", - "6/16, train_loss: 0.0853\n", - "7/16, train_loss: 0.2206\n", - "8/16, train_loss: 0.1529\n", - "9/16, train_loss: 0.0939\n", - "10/16, train_loss: 0.4467\n", - "11/16, train_loss: 0.2725\n", - "12/16, train_loss: 0.2171\n", - "13/16, train_loss: 0.1600\n", - "14/16, train_loss: 0.1502\n", - "15/16, train_loss: 0.2140\n", - "16/16, train_loss: 0.2115\n", - "epoch 8 average loss: 0.2075\n", - "saved new best metric model\n", - "current epoch: 8 current mean dice: 0.8957\n", - "best mean dice: 0.8957 at epoch: 8\n", - "----------\n", - "epoch 9/25\n", - "1/16, train_loss: 0.3275\n", - "2/16, train_loss: 0.1822\n", - "3/16, train_loss: 0.3309\n", - "4/16, train_loss: 
0.0455\n", - "5/16, train_loss: 0.2110\n", - "6/16, train_loss: 0.0818\n", - "7/16, train_loss: 0.2194\n", - "8/16, train_loss: 0.1520\n", - "9/16, train_loss: 0.0917\n", - "10/16, train_loss: 0.4467\n", - "11/16, train_loss: 0.2723\n", - "12/16, train_loss: 0.2165\n", - "13/16, train_loss: 0.1593\n", - "14/16, train_loss: 0.1236\n", - "15/16, train_loss: 0.2136\n", - "16/16, train_loss: 0.2107\n", - "epoch 9 average loss: 0.2053\n", - "----------\n", - "epoch 10/25\n", - "1/16, train_loss: 0.3271\n", - "2/16, train_loss: 0.1726\n", - "3/16, train_loss: 0.3308\n", - "4/16, train_loss: 0.0439\n", - "5/16, train_loss: 0.2106\n", - "6/16, train_loss: 0.0886\n", - "7/16, train_loss: 0.2209\n", - "8/16, train_loss: 0.1518\n", - "9/16, train_loss: 0.0860\n", - "10/16, train_loss: 0.4452\n", - "11/16, train_loss: 0.2715\n", - "12/16, train_loss: 0.2150\n", - "13/16, train_loss: 0.1589\n", - "14/16, train_loss: 0.1150\n", - "15/16, train_loss: 0.2142\n", - "16/16, train_loss: 0.2095\n", - "epoch 10 average loss: 0.2038\n", - "saved new best metric model\n", - "current epoch: 10 current mean dice: 0.8958\n", - "best mean dice: 0.8958 at epoch: 10\n", - "----------\n", - "epoch 11/25\n", - "1/16, train_loss: 0.3271\n", - "2/16, train_loss: 0.1735\n", - "3/16, train_loss: 0.3314\n", - "4/16, train_loss: 0.0430\n", - "5/16, train_loss: 0.2099\n", - "6/16, train_loss: 0.0801\n", - "7/16, train_loss: 0.2201\n", - "8/16, train_loss: 0.1508\n", - "9/16, train_loss: 0.0721\n", - "10/16, train_loss: 0.4451\n", - "11/16, train_loss: 0.2714\n", - "12/16, train_loss: 0.2155\n", - "13/16, train_loss: 0.1592\n", - "14/16, train_loss: 0.1247\n", - "15/16, train_loss: 0.2139\n", - "16/16, train_loss: 0.2107\n", - "epoch 11 average loss: 0.2030\n", - "----------\n", - "epoch 12/25\n", - "1/16, train_loss: 0.3268\n", - "2/16, train_loss: 0.1712\n", - "3/16, train_loss: 0.3305\n", - "4/16, train_loss: 0.0453\n", - "5/16, train_loss: 0.2103\n", - "6/16, train_loss: 0.0783\n", - "7/16, 
train_loss: 0.2179\n", - "8/16, train_loss: 0.1529\n", - "9/16, train_loss: 0.0912\n", - "10/16, train_loss: 0.4469\n", - "11/16, train_loss: 0.2724\n", - "12/16, train_loss: 0.2162\n", - "13/16, train_loss: 0.1588\n", - "14/16, train_loss: 0.1072\n", - "15/16, train_loss: 0.2129\n", - "16/16, train_loss: 0.2091\n", - "epoch 12 average loss: 0.2030\n", - "saved new best metric model\n", - "current epoch: 12 current mean dice: 0.9008\n", - "best mean dice: 0.9008 at epoch: 12\n", - "----------\n", - "epoch 13/25\n", - "1/16, train_loss: 0.3266\n", - "2/16, train_loss: 0.1666\n", - "3/16, train_loss: 0.3304\n", - "4/16, train_loss: 0.0419\n", - "5/16, train_loss: 0.2105\n", - "6/16, train_loss: 0.0826\n", - "7/16, train_loss: 0.2195\n", - "8/16, train_loss: 0.1506\n", - "9/16, train_loss: 0.0553\n", - "10/16, train_loss: 0.4447\n", - "11/16, train_loss: 0.2715\n", - "12/16, train_loss: 0.2125\n", - "13/16, train_loss: 0.1575\n", - "14/16, train_loss: 0.1083\n", - "15/16, train_loss: 0.2135\n", - "16/16, train_loss: 0.2085\n", - "epoch 13 average loss: 0.2000\n", - "----------\n", - "epoch 14/25\n", - "1/16, train_loss: 0.3270\n", - "2/16, train_loss: 0.1647\n", - "3/16, train_loss: 0.3316\n", - "4/16, train_loss: 0.0405\n", - "5/16, train_loss: 0.2091\n", - "6/16, train_loss: 0.0686\n", - "7/16, train_loss: 0.2185\n", - "8/16, train_loss: 0.1499\n", - "9/16, train_loss: 0.0482\n", - "10/16, train_loss: 0.4443\n", - "11/16, train_loss: 0.2708\n", - "12/16, train_loss: 0.2106\n", - "13/16, train_loss: 0.1568\n", - "14/16, train_loss: 0.1043\n", - "15/16, train_loss: 0.2121\n", - "16/16, train_loss: 0.2079\n", - "epoch 14 average loss: 0.1978\n", - "saved new best metric model\n", - "current epoch: 14 current mean dice: 0.9015\n", - "best mean dice: 0.9015 at epoch: 14\n", - "----------\n", - "epoch 15/25\n", - "1/16, train_loss: 0.3259\n", - "2/16, train_loss: 0.1630\n", - "3/16, train_loss: 0.3303\n", - "4/16, train_loss: 0.0399\n", - "5/16, train_loss: 0.2085\n", - 
"6/16, train_loss: 0.0579\n", - "7/16, train_loss: 0.2165\n", - "8/16, train_loss: 0.1509\n", - "9/16, train_loss: 0.0487\n", - "10/16, train_loss: 0.4449\n", - "11/16, train_loss: 0.2704\n", - "12/16, train_loss: 0.2090\n", - "13/16, train_loss: 0.1557\n", - "14/16, train_loss: 0.1021\n", - "15/16, train_loss: 0.2118\n", - "16/16, train_loss: 0.2084\n", - "epoch 15 average loss: 0.1965\n", - "----------\n", - "epoch 16/25\n", - "1/16, train_loss: 0.3258\n", - "2/16, train_loss: 0.1620\n", - "3/16, train_loss: 0.3307\n", - "4/16, train_loss: 0.0394\n", - "5/16, train_loss: 0.2086\n", - "6/16, train_loss: 0.0699\n", - "7/16, train_loss: 0.2170\n", - "8/16, train_loss: 0.1516\n", - "9/16, train_loss: 0.0540\n", - "10/16, train_loss: 0.4444\n", - "11/16, train_loss: 0.2698\n", - "12/16, train_loss: 0.2102\n", - "13/16, train_loss: 0.1548\n", - "14/16, train_loss: 0.1016\n", - "15/16, train_loss: 0.2114\n", - "16/16, train_loss: 0.2078\n", - "epoch 16 average loss: 0.1974\n", - "current epoch: 16 current mean dice: 0.8994\n", - "best mean dice: 0.9015 at epoch: 14\n", - "----------\n", - "epoch 17/25\n", - "1/16, train_loss: 0.3255\n", - "2/16, train_loss: 0.1636\n", - "3/16, train_loss: 0.3300\n", - "4/16, train_loss: 0.0399\n", - "5/16, train_loss: 0.2085\n", - "6/16, train_loss: 0.0483\n", - "7/16, train_loss: 0.2150\n", - "8/16, train_loss: 0.1506\n", - "9/16, train_loss: 0.0446\n", - "10/16, train_loss: 0.4445\n", - "11/16, train_loss: 0.2692\n", - "12/16, train_loss: 0.2077\n", - "13/16, train_loss: 0.1515\n", - "14/16, train_loss: 0.0980\n", - "15/16, train_loss: 0.2110\n", - "16/16, train_loss: 0.2076\n", - "epoch 17 average loss: 0.1947\n", - "----------\n", - "epoch 18/25\n", - "1/16, train_loss: 0.3255\n", - "2/16, train_loss: 0.1614\n", - "3/16, train_loss: 0.3297\n", - "4/16, train_loss: 0.0381\n", - "5/16, train_loss: 0.2081\n", - "6/16, train_loss: 0.0422\n", - "7/16, train_loss: 0.2152\n", - "8/16, train_loss: 0.1485\n", - "9/16, train_loss: 0.0415\n", 
- "10/16, train_loss: 0.4442\n", - "11/16, train_loss: 0.2690\n", - "12/16, train_loss: 0.2070\n", - "13/16, train_loss: 0.1515\n", - "14/16, train_loss: 0.0980\n", - "15/16, train_loss: 0.2112\n", - "16/16, train_loss: 0.2068\n", - "epoch 18 average loss: 0.1936\n", - "current epoch: 18 current mean dice: 0.8991\n", - "best mean dice: 0.9015 at epoch: 14\n", - "----------\n", - "epoch 19/25\n", - "1/16, train_loss: 0.3254\n", - "2/16, train_loss: 0.1635\n", - "3/16, train_loss: 0.3297\n", - "4/16, train_loss: 0.0372\n", - "5/16, train_loss: 0.2078\n", - "6/16, train_loss: 0.0424\n", - "7/16, train_loss: 0.2145\n", - "8/16, train_loss: 0.1483\n", - "9/16, train_loss: 0.0402\n", - "10/16, train_loss: 0.4436\n", - "11/16, train_loss: 0.2695\n", - "12/16, train_loss: 0.2076\n", - "13/16, train_loss: 0.1514\n", - "14/16, train_loss: 0.1009\n", - "15/16, train_loss: 0.2116\n", - "16/16, train_loss: 0.2071\n", - "epoch 19 average loss: 0.1938\n", - "----------\n", - "epoch 20/25\n", - "1/16, train_loss: 0.3256\n", - "2/16, train_loss: 0.1616\n", - "3/16, train_loss: 0.3302\n", - "4/16, train_loss: 0.0376\n", - "5/16, train_loss: 0.2080\n", - "6/16, train_loss: 0.0756\n", - "7/16, train_loss: 0.2150\n", - "8/16, train_loss: 0.1476\n", - "9/16, train_loss: 0.0400\n", - "10/16, train_loss: 0.4440\n", - "11/16, train_loss: 0.2686\n", - "12/16, train_loss: 0.2071\n", - "13/16, train_loss: 0.1512\n", - "14/16, train_loss: 0.0990\n", - "15/16, train_loss: 0.2103\n", - "16/16, train_loss: 0.2066\n", - "epoch 20 average loss: 0.1955\n", - "current epoch: 20 current mean dice: 0.8984\n", - "best mean dice: 0.9015 at epoch: 14\n", - "----------\n", - "epoch 21/25\n", - "1/16, train_loss: 0.3253\n", - "2/16, train_loss: 0.1599\n", - "3/16, train_loss: 0.3295\n", - "4/16, train_loss: 0.0370\n", - "5/16, train_loss: 0.2074\n", - "6/16, train_loss: 0.0587\n", - "7/16, train_loss: 0.2138\n", - "8/16, train_loss: 0.1483\n", - "9/16, train_loss: 0.0479\n", - "10/16, train_loss: 0.4449\n", 
- "11/16, train_loss: 0.2684\n", - "12/16, train_loss: 0.2082\n", - "13/16, train_loss: 0.1520\n", - "14/16, train_loss: 0.1122\n", - "15/16, train_loss: 0.2110\n", - "16/16, train_loss: 0.2088\n", - "epoch 21 average loss: 0.1958\n", - "----------\n", - "epoch 22/25\n", - "1/16, train_loss: 0.3258\n", - "2/16, train_loss: 0.1628\n", - "3/16, train_loss: 0.3298\n", - "4/16, train_loss: 0.0395\n", - "5/16, train_loss: 0.2082\n", - "6/16, train_loss: 0.0614\n", - "7/16, train_loss: 0.2181\n", - "8/16, train_loss: 0.1566\n", - "9/16, train_loss: 0.0650\n", - "10/16, train_loss: 0.4442\n", - "11/16, train_loss: 0.2693\n", - "12/16, train_loss: 0.2118\n", - "13/16, train_loss: 0.1532\n", - "14/16, train_loss: 0.0998\n", - "15/16, train_loss: 0.2121\n", - "16/16, train_loss: 0.2076\n", - "epoch 22 average loss: 0.1978\n", - "saved new best metric model\n", - "current epoch: 22 current mean dice: 0.9054\n", - "best mean dice: 0.9054 at epoch: 22\n", - "----------\n", - "epoch 23/25\n", - "1/16, train_loss: 0.3266\n", - "2/16, train_loss: 0.1723\n", - "3/16, train_loss: 0.3315\n", - "4/16, train_loss: 0.0413\n", - "5/16, train_loss: 0.2091\n", - "6/16, train_loss: 0.0807\n", - "7/16, train_loss: 0.2143\n", - "8/16, train_loss: 0.1514\n", - "9/16, train_loss: 0.0432\n", - "10/16, train_loss: 0.4441\n", - "11/16, train_loss: 0.2704\n", - "12/16, train_loss: 0.2081\n", - "13/16, train_loss: 0.1532\n", - "14/16, train_loss: 0.0983\n", - "15/16, train_loss: 0.2106\n", - "16/16, train_loss: 0.2072\n", - "epoch 23 average loss: 0.1976\n", - "----------\n", - "epoch 24/25\n", - "1/16, train_loss: 0.3257\n", - "2/16, train_loss: 0.1711\n", - "3/16, train_loss: 0.3307\n", - "4/16, train_loss: 0.0376\n", - "5/16, train_loss: 0.2077\n", - "6/16, train_loss: 0.0705\n", - "7/16, train_loss: 0.2141\n", - "8/16, train_loss: 0.1482\n", - "9/16, train_loss: 0.0392\n", - "10/16, train_loss: 0.4439\n", - "11/16, train_loss: 0.2688\n", - "12/16, train_loss: 0.2070\n", - "13/16, train_loss: 
0.1512\n", - "14/16, train_loss: 0.0969\n", - "15/16, train_loss: 0.2098\n", - "16/16, train_loss: 0.2062\n", - "epoch 24 average loss: 0.1955\n", - "saved new best metric model\n", - "current epoch: 24 current mean dice: 0.9060\n", - "best mean dice: 0.9060 at epoch: 24\n", - "----------\n", - "epoch 25/25\n", - "1/16, train_loss: 0.3251\n", - "2/16, train_loss: 0.1621\n", - "3/16, train_loss: 0.3298\n", - "4/16, train_loss: 0.0367\n", - "5/16, train_loss: 0.2075\n", - "6/16, train_loss: 0.0430\n", - "7/16, train_loss: 0.2132\n", - "8/16, train_loss: 0.1490\n", - "9/16, train_loss: 0.0390\n", - "10/16, train_loss: 0.4432\n", - "11/16, train_loss: 0.2699\n", - "12/16, train_loss: 0.2080\n", - "13/16, train_loss: 0.1520\n", - "14/16, train_loss: 0.0959\n", - "15/16, train_loss: 0.2101\n", - "16/16, train_loss: 0.2057\n", - "epoch 25 average loss: 0.1931\n", - "train completed, best_metric: 0.9060 at epoch: 24\n" - ] - } - ], + "outputs": [], "source": [ "max_epochs = 25\n", "val_interval = 2\n", @@ -1443,23 +672,10 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": null, "id": "5cf1fd04", "metadata": {}, - "outputs": [ - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "plt.figure(\"train\", (12, 6))\n", "plt.subplot(1, 2, 1)\n", @@ -1498,21 +714,10 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": null, "id": "29441405", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 25, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "model.load_state_dict(torch.load('monai_data/best_metric_model_pretrained.pth'))" ] @@ -1527,7 +732,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": null, "id": "94615f38", "metadata": {}, "outputs": [], @@ -1552,33 +757,10 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": null, "id": "a3f78dd4", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 27, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fig = plt.figure(frameon=False, figsize=(7,7))\n", "plt.title('Trained Calculated Spleen')\n", @@ -1587,33 +769,10 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": null, "id": "a67f89f2", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 28, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fig = plt.figure(frameon=False, figsize=(7,7))\n", "plt.title('Differences Between Actual and Model')\n", @@ -1623,33 +782,10 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": null, "id": "382c7285", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 29, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "fig = plt.figure(frameon=False, figsize=(7,7))\n", "plt.title('Differences Between The Models')\n", @@ -1675,33 +811,10 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": null, "id": "91e83d40", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 30, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "maskedspleen = np.ma.masked_where(test_outputsSpl[0].cpu().numpy()[1][:,:,200] == 0, test_outputsSpl[0].cpu().numpy()[1][:,:,200])\n", "fig = plt.figure(frameon=False, figsize=(10,10))\n", @@ -1729,7 +842,7 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": null, "id": "657e44a0", "metadata": {}, "outputs": [], @@ -1748,28 +861,10 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": null, "id": "a6fb0da7", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2022-04-27 15:06:54,404 - INFO - Expected md5 is None, skip md5 check for file monai_data/clara_pt_liver_and_tumor_ct_segmentation_1.zip.\n", - "2022-04-27 15:06:54,405 - INFO - File exists: monai_data/clara_pt_liver_and_tumor_ct_segmentation_1.zip, skipped downloading.\n", - "2022-04-27 15:06:54,425 - INFO - Non-empty folder exists in monai_data/clara_pt_liver_and_tumor_ct_segmentation, skipped extracting.\n", - "2022-04-27 15:06:54,426 - INFO - \n", - "*** \"clara_pt_liver_and_tumor_ct_segmentation\" available at monai_data/clara_pt_liver_and_tumor_ct_segmentation.\n", - "2022-04-27 15:06:54,889 - INFO - *** Model: \n", - "2022-04-27 15:06:54,938 - INFO - *** Model params: {'dimensions': 3, 'in_channels': 1, 'out_channels': 3, 'channels': [16, 32, 64, 128, 256], 'strides': [2, 2, 2, 2], 'num_res_units': 2, 'norm': 'batch'}\n", - "2022-04-27 15:06:54,950 - INFO - \n", - "---\n", - "2022-04-27 15:06:54,951 - INFO - For more information, please visit https://ngc.nvidia.com/catalog/models/nvidia:med:clara_pt_liver_and_tumor_ct_segmentation\n", - "\n" - ] - } - ], + "outputs": [], "source": [ " try: #MONAI=0.8\n", " unet_model = load_from_mmar(\n", @@ -1789,29 +884,10 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": null, "id": "55034354", "metadata": {}, - "outputs": [ 
- { - "name": "stdout", - "output_type": "stream", - "text": [ - "using a pretrained model.\n", - "2022-04-27 15:06:55,931 - INFO - Expected md5 is None, skip md5 check for file monai_data/clara_pt_liver_and_tumor_ct_segmentation_1.zip.\n", - "2022-04-27 15:06:55,931 - INFO - File exists: monai_data/clara_pt_liver_and_tumor_ct_segmentation_1.zip, skipped downloading.\n", - "2022-04-27 15:06:55,932 - INFO - Non-empty folder exists in monai_data/clara_pt_liver_and_tumor_ct_segmentation, skipped extracting.\n", - "2022-04-27 15:06:55,933 - INFO - \n", - "*** \"clara_pt_liver_and_tumor_ct_segmentation\" available at monai_data/clara_pt_liver_and_tumor_ct_segmentation.\n", - "2022-04-27 15:06:55,962 - INFO - *** Model: \n", - "2022-04-27 15:06:56,010 - INFO - *** Model params: {'dimensions': 3, 'in_channels': 1, 'out_channels': 3, 'channels': [16, 32, 64, 128, 256], 'strides': [2, 2, 2, 2], 'num_res_units': 2, 'norm': 'batch'}\n", - "2022-04-27 15:06:56,023 - INFO - \n", - "---\n", - "2022-04-27 15:06:56,024 - INFO - For more information, please visit https://ngc.nvidia.com/catalog/models/nvidia:med:clara_pt_liver_and_tumor_ct_segmentation\n", - "\n" - ] - } - ], + "outputs": [], "source": [ "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n", "\n", @@ -1834,7 +910,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": null, "id": "a79c1731", "metadata": {}, "outputs": [], @@ -1860,33 +936,10 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": null, "id": "c0956706", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 35, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "sliceval = 215\n", "maskedliv = np.ma.masked_where(test_outputsliv[0].cpu().numpy()[1][:,:,sliceval] == 0, test_outputsliv[0].cpu().numpy()[1][:,:,sliceval])\n", @@ -1900,33 +953,10 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": null, "id": "5bdfdbe9", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 36, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], + "outputs": [], "source": [ "sliceval = 110\n", "maskedliv = np.ma.masked_where(test_outputsliv[0].cpu().numpy()[1][:,sliceval,:] == 0, test_outputsliv[0].cpu().numpy()[1][:,sliceval,:])\n", @@ -1949,28 +979,20 @@ ] }, { - "cell_type": "code", - "execution_count": null, - "id": "0dce4d55", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e17e6228", + "cell_type": "markdown", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "## Conclusions\n", + "Here you learned how to use NVIDIA pre-trained models for image segmentation." + ] }, { - "cell_type": "code", - "execution_count": null, - "id": "7034135a", + "cell_type": "markdown", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "## Clean up\n", + "Shut down your compute environment and delete any resource groups associated with this notebook." + ] } ], "metadata": { diff --git a/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb b/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb index efaa963..fe594c4 100644 --- a/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb +++ b/notebooks/rnaseq-myco-tutorial-main/RNAseq_pipeline.ipynb @@ -2,41 +2,66 @@ "cells": [ { "cell_type": "markdown", + "metadata": {}, "source": [ "# RNA-Seq Analysis Training Demo on Azure" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "## Overview" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "This short tutorial demonstrates how to run an RNA-Seq workflow using a prokaryotic data set. Steps in the workflow include read trimming, read QC, read mapping, and counting mapped reads per gene to quantitative gene expression." 
- ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 1: Setup Environment" - ], - "metadata": {} + "## Prerequisites\n", + "We assume you have provisioned a compute environment in Azure ML Studio." + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "Note that within Jupyter you can run a bash comman either by using the magic '!' in front of your command, or by adding %%bash to the top of your cell." - ], - "metadata": {} + "## Learning objectives\n", + "+ Learn how to copy data to and from Blob storage\n", + "+ Learn how to run and visualize basic RNAseq analysis" + ] }, { "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get started" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install packages" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that within Jupyter you can run a bash command either by using the magic '!' in front of your command, or by adding %%bash to the top of your cell." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, "source": [ "For example\n", "```\n", @@ -47,98 +72,82 @@ "```\n", "!example command\n", "```" - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "The first step is to install mamba forge, which is the newer and faster version of the conda package manager." - ], - "metadata": {} + "The first step is to install Mambaforge, a conda-forge installer that ships mamba, a faster drop-in replacement for the conda package manager."
+ ] }, { "cell_type": "code", - "source": [ - "!curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh\n", - "!bash Mambaforge-$(uname)-$(uname -m).sh -b -p $HOME/mambaforge" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\n100 82.9M 100 82.9M 0 0 115M 0 --:--:-- --:--:-- --:--:-- 198M\nERROR: File or directory already exists: '/home/azureuser/mambaforge'\nIf you want to update an existing installation, use the -u option.\n" - } - ], - "execution_count": 1, + "execution_count": null, "metadata": { "tags": [] - } + }, + "outputs": [], + "source": [ + "! curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh\n", + "! bash Mambaforge-$(uname)-$(uname -m).sh -b -p $HOME/mambaforge" + ] }, { "cell_type": "code", - "source": [ - "#add to your path\n", - "import os\n", - "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"]+\"/mambaforge/bin\"" - ], - "outputs": [], - "execution_count": 2, + "execution_count": null, "metadata": { "gather": { "logged": 1682515170386 } - } + }, + "outputs": [], + "source": [ + "#add to your path\n", + "import os\n", + "os.environ[\"PATH\"] += os.pathsep + os.environ[\"HOME\"]+\"/mambaforge/bin\"" + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "! 
mamba info --envs" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "\n __ __ __ __\n / \\ / \\ / \\ / \\\n / \\/ \\/ \\/ \\\n███████████████/ /██/ /██/ /██/ /████████████████████████\n / / \\ / \\ / \\ / \\ \\____\n / / \\_/ \\_/ \\_/ \\ o \\__,\n / _/ \\_____/ `\n |/\n ███╗ ███╗ █████╗ ███╗ ███╗██████╗ █████╗\n ████╗ ████║██╔══██╗████╗ ████║██╔══██╗██╔══██╗\n ██╔████╔██║███████║██╔████╔██║██████╔╝███████║\n ██║╚██╔╝██║██╔══██║██║╚██╔╝██║██╔══██╗██╔══██║\n ██║ ╚═╝ ██║██║ ██║██║ ╚═╝ ██║██████╔╝██║ ██║\n ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝\n\n mamba (1.1.0) supported by @QuantStack\n\n GitHub: https://github.com/mamba-org/mamba\n Twitter: https://twitter.com/QuantStack\n\n█████████████████████████████████████████████████████████████\n\n# conda environments:\n#\n /anaconda\nbase /home/azureuser/mambaforge\n\n" - } - ], - "execution_count": 3, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Next, we will install the necessary packages into the current environment." - ], - "metadata": {} + ] }, { "cell_type": "code", - "source": [ - "! 
mamba install -c conda-forge -c bioconda -c defaults -y sra-tools pigz pbzip2 fastp fastqc multiqc salmon" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "\n __ __ __ __\n / \\ / \\ / \\ / \\\n / \\/ \\/ \\/ \\\n███████████████/ /██/ /██/ /██/ /████████████████████████\n / / \\ / \\ / \\ / \\ \\____\n / / \\_/ \\_/ \\_/ \\ o \\__,\n / _/ \\_____/ `\n |/\n ███╗ ███╗ █████╗ ███╗ ███╗██████╗ █████╗\n ████╗ ████║██╔══██╗████╗ ████║██╔══██╗██╔══██╗\n ██╔████╔██║███████║██╔████╔██║██████╔╝███████║\n ██║╚██╔╝██║██╔══██║██║╚██╔╝██║██╔══██╗██╔══██║\n ██║ ╚═╝ ██║██║ ██║██║ ╚═╝ ██║██████╔╝██║ ██║\n ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝ ╚═╝\n\n mamba (1.1.0) supported by @QuantStack\n\n GitHub: https://github.com/mamba-org/mamba\n Twitter: https://twitter.com/QuantStack\n\n█████████████████████████████████████████████████████████████\n\n\nLooking for: ['sra-tools', 'pigz=2.6', 'pbzip2=1.1', 'fastp=0.23.2', 'fastqc=0.11.9', 'multiqc', 'salmon=1.5.1']\n\n\u001b[?25l\u001b[2K\u001b[0G[+] 0.0s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━\u001b[0m 0.0 B / ??.?MB @ ??.?MB/s 0.1s\nconda-forge/noarch \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 0.0 B / ??.?MB @ ??.?MB/s 0.1s\nbioconda/linux-64 \u001b[33m━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━\u001b[0m 0.0 B / ??.?MB @ ??.?MB/s 0.1s\nbioconda/noarch \u001b[90m━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━\u001b[0m 0.0 B / ??.?MB @ ??.?MB/s 0.1s\npkgs/main/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 0.0 B / ??.?MB @ ??.?MB/s 0.1s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0Gpkgs/main/linux-64 No change\nbioconda/linux-64 No change\npkgs/r/noarch No change\nbioconda/noarch No change\npkgs/main/noarch No change\npkgs/r/linux-64 No change\n[+] 0.2s\nconda-forge/linux-64 
\u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 123.0kB / ??.?MB @ 756.3kB/s 0.2s\nconda-forge/noarch \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 182.2kB / ??.?MB @ 1.1MB/s 0.2s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.3s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 539.1kB / ??.?MB @ 2.0MB/s 0.3s\nconda-forge/noarch \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 734.4kB / ??.?MB @ 2.8MB/s 0.3s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.4s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 1.0MB / ??.?MB @ 2.8MB/s 0.4s\nconda-forge/noarch \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 1.3MB / ??.?MB @ 3.6MB/s 0.4s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.5s\nconda-forge/linux-64 \u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 1.5MB / ??.?MB @ 3.1MB/s 0.5s\nconda-forge/noarch \u001b[33m━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━\u001b[0m 1.9MB / ??.?MB @ 4.1MB/s 0.5s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.6s\nconda-forge/linux-64 \u001b[90m━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━\u001b[0m 2.0MB / ??.?MB @ 3.5MB/s 0.6s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 2.4MB / ??.?MB @ 4.2MB/s 0.6s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.7s\nconda-forge/linux-64 \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 2.4MB / ??.?MB @ 3.6MB/s 0.7s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 2.9MB / ??.?MB @ 4.3MB/s 0.7s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.8s\nconda-forge/linux-64 \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 2.9MB / ??.?MB @ 3.8MB/s 0.8s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 3.4MB / ??.?MB @ 4.4MB/s 
0.8s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 0.9s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 3.4MB / ??.?MB @ 3.9MB/s 0.9s\nconda-forge/noarch \u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 3.9MB / ??.?MB @ 4.5MB/s 0.9s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.0s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 3.9MB / ??.?MB @ 4.0MB/s 1.0s\nconda-forge/noarch \u001b[90m━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━\u001b[0m 4.4MB / ??.?MB @ 4.6MB/s 1.0s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━\u001b[0m 4.3MB / ??.?MB @ 4.0MB/s 1.1s\nconda-forge/noarch \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 4.8MB / ??.?MB @ 4.5MB/s 1.1s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.2s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 4.8MB / ??.?MB @ 4.1MB/s 1.2s\nconda-forge/noarch \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 5.1MB / ??.?MB @ 4.3MB/s 1.2s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.3s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 5.2MB / ??.?MB @ 4.1MB/s 1.3s\nconda-forge/noarch \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 5.5MB / ??.?MB @ 4.3MB/s 1.3s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.4s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 5.7MB / ??.?MB @ 4.1MB/s 1.4s\nconda-forge/noarch \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 6.0MB / ??.?MB @ 4.4MB/s 1.4s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.5s\nconda-forge/linux-64 \u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 6.2MB / ??.?MB @ 4.2MB/s 1.5s\nconda-forge/noarch 
\u001b[33m━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━\u001b[0m 6.5MB / ??.?MB @ 4.4MB/s 1.5s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.6s\nconda-forge/linux-64 \u001b[90m━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━\u001b[0m 6.7MB / ??.?MB @ 4.2MB/s 1.6s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 7.0MB / ??.?MB @ 4.4MB/s 1.6s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.7s\nconda-forge/linux-64 \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 7.1MB / ??.?MB @ 4.3MB/s 1.7s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 7.5MB / ??.?MB @ 4.4MB/s 1.7s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.8s\nconda-forge/linux-64 \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 7.7MB / ??.?MB @ 4.3MB/s 1.8s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 7.8MB / ??.?MB @ 4.4MB/s 1.8s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 1.9s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 8.1MB / ??.?MB @ 4.3MB/s 1.9s\nconda-forge/noarch \u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 8.3MB / ??.?MB @ 4.4MB/s 1.9s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.0s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 8.6MB / ??.?MB @ 4.3MB/s 2.0s\nconda-forge/noarch \u001b[90m━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━\u001b[0m 8.7MB / ??.?MB @ 4.4MB/s 2.0s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━\u001b[0m 9.1MB / ??.?MB @ 4.4MB/s 2.1s\nconda-forge/noarch \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 9.1MB / ??.?MB @ 4.4MB/s 2.1s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.2s\nconda-forge/linux-64 
\u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 9.6MB / ??.?MB @ 4.4MB/s 2.2s\nconda-forge/noarch \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 9.4MB / ??.?MB @ 4.3MB/s 2.2s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.3s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 10.1MB / ??.?MB @ 4.4MB/s 2.3s\nconda-forge/noarch \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 9.9MB / ??.?MB @ 4.3MB/s 2.3s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.4s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 10.6MB / ??.?MB @ 4.5MB/s 2.4s\nconda-forge/noarch \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 10.4MB / ??.?MB @ 4.3MB/s 2.4s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.5s\nconda-forge/linux-64 \u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 11.1MB / ??.?MB @ 4.5MB/s 2.5s\nconda-forge/noarch \u001b[33m━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━\u001b[0m 10.8MB / ??.?MB @ 4.4MB/s 2.5s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.6s\nconda-forge/linux-64 \u001b[90m━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━\u001b[0m 11.6MB / ??.?MB @ 4.5MB/s 2.6s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 11.3MB / ??.?MB @ 4.4MB/s 2.6s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.7s\nconda-forge/linux-64 \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 12.1MB / ??.?MB @ 4.5MB/s 2.7s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 11.8MB / ??.?MB @ 4.4MB/s 2.7s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.8s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 12.4MB / ??.?MB @ 4.5MB/s 2.8s\nconda-forge/noarch \u001b[33m━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━\u001b[0m 12.0MB / ??.?MB @ 4.4MB/s 
2.8s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 2.9s\nconda-forge/linux-64 \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 12.4MB @ 4.5MB/s 2.9s\nconda-forge/noarch ━━━━━━━━━━━━━━━━━━━━━━ 12.0MB @ 4.4MB/s Finalizing 2.9s\u001b[2K\u001b[1A\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.0s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 12.4MB / ??.?MB @ 4.5MB/s 3.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.1s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 12.4MB / ??.?MB @ 4.5MB/s 3.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.2s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 12.4MB / ??.?MB @ 4.5MB/s 3.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.3s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 12.4MB / ??.?MB @ 4.5MB/s 3.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0Gconda-forge/noarch @ 4.4MB/s 2.9s\n[+] 3.4s\nconda-forge/linux-64 \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 12.4MB / ??.?MB @ 3.7MB/s 3.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.5s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 13.7MB / ??.?MB @ 4.0MB/s 3.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.6s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 14.3MB / ??.?MB @ 4.0MB/s 3.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.7s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━\u001b[0m 14.8MB / ??.?MB @ 4.0MB/s 3.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.8s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 15.3MB / ??.?MB @ 4.1MB/s 3.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 3.9s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 15.8MB / ??.?MB @ 4.1MB/s 3.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.0s\nconda-forge/linux-64 
\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 16.4MB / ??.?MB @ 4.1MB/s 4.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.1s\nconda-forge/linux-64 \u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 16.9MB / ??.?MB @ 4.2MB/s 4.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.2s\nconda-forge/linux-64 \u001b[90m━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━\u001b[0m 17.4MB / ??.?MB @ 4.2MB/s 4.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.3s\nconda-forge/linux-64 \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 17.9MB / ??.?MB @ 4.2MB/s 4.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.4s\nconda-forge/linux-64 \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 18.4MB / ??.?MB @ 4.2MB/s 4.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.5s\nconda-forge/linux-64 \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 18.4MB / ??.?MB @ 4.2MB/s 4.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.6s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 19.4MB / ??.?MB @ 4.3MB/s 4.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.7s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━\u001b[0m 19.9MB / ??.?MB @ 4.3MB/s 4.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.8s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━\u001b[0m 20.4MB / ??.?MB @ 4.3MB/s 4.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 4.9s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━\u001b[0m 20.9MB / ??.?MB @ 4.3MB/s 4.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.0s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 21.4MB / ??.?MB @ 4.3MB/s 5.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━\u001b[0m 21.9MB / ??.?MB @ 4.3MB/s 5.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.2s\nconda-forge/linux-64 
\u001b[90m━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━\u001b[0m 22.3MB / ??.?MB @ 4.3MB/s 5.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.3s\nconda-forge/linux-64 \u001b[90m━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━\u001b[0m 22.8MB / ??.?MB @ 4.3MB/s 5.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.4s\nconda-forge/linux-64 \u001b[90m━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━━\u001b[0m 23.3MB / ??.?MB @ 4.3MB/s 5.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.5s\nconda-forge/linux-64 \u001b[90m━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━\u001b[0m 23.8MB / ??.?MB @ 4.3MB/s 5.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.6s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━\u001b[0m 24.3MB / ??.?MB @ 4.4MB/s 5.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.7s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━\u001b[0m 24.5MB / ??.?MB @ 4.3MB/s 5.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.8s\nconda-forge/linux-64 \u001b[33m━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━━\u001b[0m 25.0MB / ??.?MB @ 4.3MB/s 5.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 5.9s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━\u001b[0m 25.4MB / ??.?MB @ 4.3MB/s 5.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.0s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━\u001b[0m 25.9MB / ??.?MB @ 4.3MB/s 6.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━\u001b[0m 26.4MB / ??.?MB @ 4.3MB/s 6.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.2s\nconda-forge/linux-64 \u001b[90m╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━\u001b[0m 26.9MB / ??.?MB @ 4.3MB/s 6.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.3s\nconda-forge/linux-64 \u001b[90m━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━\u001b[0m 27.3MB / ??.?MB @ 4.3MB/s 6.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.4s\nconda-forge/linux-64 
\u001b[90m━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━\u001b[0m 27.8MB / ??.?MB @ 4.4MB/s 6.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.5s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 28.2MB / ??.?MB @ 4.3MB/s 6.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.6s\nconda-forge/linux-64 \u001b[90m━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━━━\u001b[0m 28.2MB / ??.?MB @ 4.3MB/s 6.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.7s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━━━\u001b[0m 28.9MB / ??.?MB @ 4.3MB/s 6.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.8s\nconda-forge/linux-64 \u001b[90m━━━━━━━━━━━╸\u001b[0m\u001b[33m━━━━━━━━━━━\u001b[0m 29.3MB / ??.?MB @ 4.3MB/s 6.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 6.9s\nconda-forge/linux-64 \u001b[33m━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━━\u001b[0m 29.8MB / ??.?MB @ 4.3MB/s 6.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.0s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━━━\u001b[0m 30.3MB / ??.?MB @ 4.3MB/s 7.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━━\u001b[0m 30.7MB / ??.?MB @ 4.3MB/s 7.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.2s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.3s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.4s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.5s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.6s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / 
??.?MB @ 4.3MB/s 7.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.7s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.8s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 7.9s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 7.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.0s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.2s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.3s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.4s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.5s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.6s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.7s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.8s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 8.9s\nconda-forge/linux-64 
\u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 8.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.0s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.1s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.2s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.2s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.3s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.3s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.4s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.4s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.5s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.5s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.6s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.6s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.7s\nconda-forge/linux-64 \u001b[33m━━━━━━━━━━━━━╸\u001b[0m\u001b[90m━━━━━━━━━\u001b[0m 30.9MB / ??.?MB @ 4.3MB/s 9.7s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.8s\nconda-forge/linux-64 ━━━━━━━━━━━━━━━━━━━━━━ 31.1MB @ 4.3MB/s Finalizing 9.8s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 9.9s\nconda-forge/linux-64 ━━━━━━━━━━━━━━━━━━━━━━ 31.1MB @ 4.3MB/s Finalizing 9.9s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.0s\nconda-forge/linux-64 ━━━━━━━━━━━━━━━━━━━━━━ 31.1MB @ 4.3MB/s Finalizing 10.0s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.1s\nconda-forge/linux-64 ━━━━━━━━━━━━━━━━━━━━━━ 31.1MB @ 4.3MB/s Finalizing 10.1s\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.2s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.3s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 
10.4s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.5s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.6s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.7s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.8s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 10.9s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 11.0s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 11.1s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 11.2s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0G[+] 11.3s\n\u001b[2K\u001b[1A\u001b[2K\u001b[0Gconda-forge/linux-64 @ 4.3MB/s 10.1s\n\u001b[?25h\nPinned packages:\n - python 3.10.*\n\n\nTransaction\n\n Prefix: /home/azureuser/mambaforge\n\n All requested packages already installed\n\n\u001b[?25l\u001b[2K\u001b[0G\u001b[?25h" - } - ], - "execution_count": 17, + "execution_count": null, "metadata": { "scrolled": true, "tags": [] - } + }, + "outputs": [], + "source": [ + "! mamba install -c conda-forge -c bioconda -c defaults -y sra-tools pigz pbzip2 fastp fastqc multiqc salmon" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Create a set of directories to store the reads, reference sequence files, and output files.\n" - ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "%%bash\n", "mkdir -p data\n", @@ -148,409 +157,337 @@ "mkdir -p data/aligned\n", "mkdir -p data/reference\n", "mkdir -p data/quants" - ], - "outputs": [], - "execution_count": 33, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 2: Copy FASTQ Files\n", + "### Copy FASTQ Files\n", "In order for this tutorial to run quickly, we will only analyze 50,000 reads from a sample from both sample groups instead of analyzing all the reads from all six samples. These files have been posted on a Azure Blob storage containers that we made publicly accessible." 
- ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!curl https://storeshare.blob.core.windows.net/publicdata/testsample/RNAseq/raw_fastq/SRR13349122_1.fastq --output data/raw_fastq/SRR13349122_1.fastq\n", "!curl https://storeshare.blob.core.windows.net/publicdata/testsample/RNAseq/raw_fastq/SRR13349122_2.fastq --output data/raw_fastq/SRR13349122_2.fastq\n", "!curl https://storeshare.blob.core.windows.net/publicdata/testsample/RNAseq/raw_fastq/SRR13349128_1.fastq --output data/raw_fastq/SRR13349128_1.fastq\n", "!curl https://storeshare.blob.core.windows.net/publicdata/testsample/RNAseq/raw_fastq/SRR13349128_2.fastq --output data/raw_fastq/SRR13349128_2.fastq" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n100 8452k 100 8452k 0 0 10.4M 0 --:--:-- --:--:-- --:--:-- 10.4M\n % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n100 8452k 100 8452k 0 0 9328k 0 --:--:-- --:--:-- --:--:-- 9319k\n % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n100 8452k 100 8452k 0 0 11.1M 0 --:--:-- --:--:-- --:--:-- 11.1M\n % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n100 8452k 100 8452k 0 0 12.7M 0 --:--:-- --:--:-- --:--:-- 12.7M\n" - } - ], - "execution_count": 6, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 3: Copy reference transcriptome files that will be used by Salmon\n", + "### Copy reference transcriptome files that will be used by Salmon\n", "Salmon is a tool that aligns RNA-Seq reads to a set of transcripts rather than the entire genome." 
- ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!curl https://storeshare.blob.core.windows.net/publicdata/testsample/RNAseq/reference/M_chelonae_transcripts.fasta --output data/reference/M_chelonae_transcripts.fasta\n", "!curl https://storeshare.blob.core.windows.net/publicdata/testsample/RNAseq/reference/decoys.txt --output data/reference/decoys.txt" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n100 9599k 100 9599k 0 0 12.3M 0 --:--:-- --:--:-- --:--:-- 12.3M\n % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n100 14 100 14 0 0 76 0 --:--:-- --:--:-- --:--:-- 76\n" - } - ], - "execution_count": 27, - "metadata": {} + ] }, { "cell_type": "code", - "source": [ - "ls data/raw_fastq" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "\u001b[0m\u001b[01;32mSRR13349122_1.fastq\u001b[0m* \u001b[01;32mSRR13349128_1.fastq\u001b[0m*\r\n\u001b[01;32mSRR13349122_2.fastq\u001b[0m* \u001b[01;32mSRR13349128_2.fastq\u001b[0m*\r\n" - } - ], - "execution_count": 38, + "execution_count": null, "metadata": { + "gather": { + "logged": 1682517580413 + }, "jupyter": { - "source_hidden": false, - "outputs_hidden": false + "outputs_hidden": false, + "source_hidden": false }, "nteract": { "transient": { "deleting": false } - }, - "gather": { - "logged": 1682517580413 } - } + }, + "outputs": [], + "source": [ + "ls data/raw_fastq" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 4: Trim our data with Fastp" - ], - "metadata": {} + "### Trim our data with Fastp" + ] }, { "cell_type": "code", - "source": [ - "! 
fastp -i data/raw_fastq/SRR13349122_1.fastq -I data/raw_fastq/SRR13349122_2.fastq -o data/trimmed/SRR13349122_1_trimmed.fastq -O data/trimmed/SRR13349122_2_trimmed.fastq\n", - "! fastp -i data/raw_fastq/SRR13349128_1.fastq -I data/raw_fastq/SRR13349128_2.fastq -o data/trimmed/SRR13349128_1_trimmed.fastq -O data/trimmed/SRR13349128_2_trimmed.fastq" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "Read1 before filtering:\ntotal reads: 50000\ntotal bases: 2550000\nQ20 bases: 2451900(96.1529%)\nQ30 bases: 2370275(92.952%)\n\nRead2 before filtering:\ntotal reads: 50000\ntotal bases: 2550000\nQ20 bases: 2376817(93.2085%)\nQ30 bases: 2255260(88.4416%)\n\nRead1 after filtering:\ntotal reads: 49849\ntotal bases: 2542226\nQ20 bases: 2444408(96.1523%)\nQ30 bases: 2363088(92.9535%)\n\nRead2 after filtering:\ntotal reads: 49849\ntotal bases: 2542226\nQ20 bases: 2374927(93.4192%)\nQ30 bases: 2253977(88.6616%)\n\nFiltering result:\nreads passed filter: 99698\nreads failed due to low quality: 246\nreads failed due to too many N: 56\nreads failed due to too short: 0\nreads with adapter trimmed: 18\nbases trimmed due to adapters: 146\n\nDuplication rate: 23.57%\n\nInsert size peak (evaluated by paired-end reads): 33\n\nJSON report: fastp.json\nHTML report: fastp.html\n\nfastp -i data/raw_fastq/SRR13349122_1.fastq -I data/raw_fastq/SRR13349122_2.fastq -o data/trimmed/SRR13349122_1_trimmed.fastq -O data/trimmed/SRR13349122_2_trimmed.fastq \nfastp v0.23.2, time used: 2 seconds\nRead1 before filtering:\ntotal reads: 50000\ntotal bases: 2550000\nQ20 bases: 2447617(95.985%)\nQ30 bases: 2363073(92.6695%)\n\nRead2 before filtering:\ntotal reads: 50000\ntotal bases: 2550000\nQ20 bases: 2379063(93.2966%)\nQ30 bases: 2258898(88.5842%)\n\nRead1 after filtering:\ntotal reads: 49831\ntotal bases: 2541263\nQ20 bases: 2439163(95.9823%)\nQ30 bases: 2354964(92.669%)\n\nRead2 after filtering:\ntotal reads: 49831\ntotal bases: 2541263\nQ20 bases: 2377253(93.5461%)\nQ30 
bases: 2257594(88.8375%)\n\nFiltering result:\nreads passed filter: 99662\nreads failed due to low quality: 284\nreads failed due to too many N: 54\nreads failed due to too short: 0\nreads with adapter trimmed: 26\nbases trimmed due to adapters: 236\n\nDuplication rate: 24.244%\n\nInsert size peak (evaluated by paired-end reads): 70\n\nJSON report: fastp.json\nHTML report: fastp.html\n\nfastp -i data/raw_fastq/SRR13349128_1.fastq -I data/raw_fastq/SRR13349128_2.fastq -o data/trimmed/SRR13349128_1_trimmed.fastq -O data/trimmed/SRR13349128_2_trimmed.fastq \nfastp v0.23.2, time used: 1 seconds\n" - } - ], - "execution_count": 39, + "execution_count": null, "metadata": { "jupyter": { - "source_hidden": false, - "outputs_hidden": false + "outputs_hidden": false, + "source_hidden": false }, "nteract": { "transient": { "deleting": false } } - } + }, + "outputs": [], + "source": [ + "! fastp -i data/raw_fastq/SRR13349122_1.fastq -I data/raw_fastq/SRR13349122_2.fastq -o data/trimmed/SRR13349122_1_trimmed.fastq -O data/trimmed/SRR13349122_2_trimmed.fastq\n", + "! fastp -i data/raw_fastq/SRR13349128_1.fastq -I data/raw_fastq/SRR13349128_2.fastq -o data/trimmed/SRR13349128_1_trimmed.fastq -O data/trimmed/SRR13349128_2_trimmed.fastq" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 6: Run FastQC\n", + "### Run FastQC\n", "FastQC is an invaluable tool that allows you to evaluate whether there are problems with a set of reads. For example, it will provide a report of whether there is any bias in the sequence composition of the reads." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Once FastQC is done running, look at the outputs in data/fastqc. What can you say about the quality of the two samples we are looking at here? 
" - ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "%%bash\n", "fastqc -o data/fastqc data/trimmed/SRR13349122_1_trimmed.fastq\n", "fastqc -o data/fastqc data/trimmed/SRR13349128_1_trimmed.fastq" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stderr", - "text": "Started analysis of SRR13349122_1_trimmed.fastq\nApprox 5% complete for SRR13349122_1_trimmed.fastq\nApprox 10% complete for SRR13349122_1_trimmed.fastq\nApprox 15% complete for SRR13349122_1_trimmed.fastq\nApprox 20% complete for SRR13349122_1_trimmed.fastq\nApprox 25% complete for SRR13349122_1_trimmed.fastq\nApprox 30% complete for SRR13349122_1_trimmed.fastq\nApprox 35% complete for SRR13349122_1_trimmed.fastq\nApprox 40% complete for SRR13349122_1_trimmed.fastq\nApprox 45% complete for SRR13349122_1_trimmed.fastq\nApprox 50% complete for SRR13349122_1_trimmed.fastq\nApprox 55% complete for SRR13349122_1_trimmed.fastq\nApprox 60% complete for SRR13349122_1_trimmed.fastq\nApprox 65% complete for SRR13349122_1_trimmed.fastq\nApprox 70% complete for SRR13349122_1_trimmed.fastq\nApprox 75% complete for SRR13349122_1_trimmed.fastq\nApprox 80% complete for SRR13349122_1_trimmed.fastq\nApprox 85% complete for SRR13349122_1_trimmed.fastq\nApprox 90% complete for SRR13349122_1_trimmed.fastq\nApprox 95% complete for SRR13349122_1_trimmed.fastq\nSkipping 'data/trimmed/SRR13349128_1_trimmed.fastq' which didn't exist, or couldn't be read\n" - }, - { - "output_type": "stream", - "name": "stdout", - "text": "Analysis complete for SRR13349122_1_trimmed.fastq\n" - } - ], - "execution_count": 15, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 7: Run MultiQC\n", + "### Run MultiQC\n", "MultiQC reads in the FastQQ reports and generate a compiled report for all the analyzed FASTQ files.\n", "Just as with fastqc, we can look at the mulitqc results after it finishes at data/multiqc_data" - 
], - "metadata": {} + ] }, { "cell_type": "code", - "source": [ - "! multiqc -f data/fastqc -f\n", - "#! mv multiqc_data/ data/" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "\u001b[1;30m[WARNING]\u001b[0m multiqc : \u001b[33mMultiQC Version v1.14 now available!\u001b[0m\n\u001b[1;30m[INFO ]\u001b[0m multiqc : This is MultiQC v1.10.1\n\u001b[1;30m[INFO ]\u001b[0m multiqc : Template : default\n\u001b[1;30m[INFO ]\u001b[0m multiqc : Searching : /mnt/batch/tasks/shared/LS_root/mounts/clusters/cloud-lab-notebooks/code/Users/oconnellka/NIHCloudLabAzure-main 2/tutorials/notebooks/rnaseq-myco-tutorial-main/data/fastqc\n\u001b[2KSearching \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[35m100%\u001b[0m \u001b[32m2/2\u001b[0m [2mdata/fastqc/SRR13349122_1_trimmed_fastqc.html\u001b[0m\n\u001b[?25h\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'custom_content' MultiQC module broke... \n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule custom_content raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/custom_content/custom_content.py\", line 87, in custom_module_classes\n bm = BaseMultiqcModule()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File 
\"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'ngsderive' MultiQC module broke... \n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule ngsderive raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/ngsderive/ngsderive.py\", line 29, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'purple' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule purple raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/purple/purple.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'conpair' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule conpair raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/conpair/conpair.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'peddy' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule peddy raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/peddy/peddy.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'somalier' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule somalier raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/somalier/somalier.py\", line 29, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'methylQA' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule methylQA raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/methylQA/methylQA.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'mosdepth' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule mosdepth raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/mosdepth/mosdepth.py\", line 74, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'phantompeakqualtools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule phantompeakqualtools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/phantompeakqualtools/phantompeakqualtools.py\", line 20, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'qualimap' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule qualimap raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/qualimap/qualimap.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'preseq' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule preseq raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/preseq/preseq.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'quast' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule quast raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/quast/quast.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'qorts' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule qorts raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/qorts/qorts.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'rna_seqc' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule rna_seqc raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/rna_seqc/rna_seqc.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'rockhopper' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule rockhopper raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/rockhopper/rockhopper.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'rsem' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule rsem raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/rsem/rsem.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'rseqc' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule rseqc raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/rseqc/rseqc.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'busco' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule busco raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/busco/busco.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'goleft_indexcov' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule goleft_indexcov raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/goleft_indexcov/goleft_indexcov.py\", line 19, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'disambiguate' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule disambiguate raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/disambiguate/disambiguate.py\", line 16, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'supernova' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule supernova raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/supernova/supernova.py\", line 19, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'deeptools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule deeptools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/deeptools/deeptools.py\", line 38, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'sargasso' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule sargasso raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/sargasso/sargasso.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'verifybamid' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule verifybamid raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/verifybamid/verifybamid.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'mirtrace' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule mirtrace raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/mirtrace/mirtrace.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'happy' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule happy raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/happy/happy.py\", line 32, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'mirtop' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule mirtop raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/mirtop/mirtop.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'homer' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule homer raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/homer/homer.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'hops' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule hops raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/hops/hops.py\", line 18, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'macs2' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule macs2 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/macs2/macs2.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'theta2' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule theta2 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/theta2/theta2.py\", line 20, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'snpeff' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule snpeff raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/snpeff/snpeff.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'gatk' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule gatk raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/gatk/gatk.py\", line 27, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'htseq' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule htseq raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/htseq/htseq.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bcftools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bcftools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bcftools/bcftools.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'featureCounts' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule featureCounts raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/featureCounts/feature_counts.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'fgbio' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule fgbio raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/fgbio/fgbio.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'dragen' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule dragen raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/dragen/dragen.py\", line 42, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'dedup' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule dedup raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/dedup/dedup.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'damageprofiler' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule damageprofiler raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/damageprofiler/damageprofiler.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'biobambam2' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule biobambam2 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/biobambam2/biobambam2.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'jcvi' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule jcvi raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/jcvi/jcvi.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'mtnucratio' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule mtnucratio raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/mtnucratio/mtnucratio.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'picard' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule picard raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/picard/picard.py\", line 44, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'sentieon' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule sentieon raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/sentieon/sentieon.py\", line 30, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'prokka' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule prokka raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/prokka/prokka.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'qc3C' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule qc3C raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/qc3C/qc3C.py\", line 137, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'samblaster' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule samblaster raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/samblaster/samblaster.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'samtools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule samtools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/samtools/samtools.py\", line 28, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'sexdeterrmine' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule sexdeterrmine raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/sexdeterrmine/sexdeterrmine.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'eigenstratdatabasetools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule eigenstratdatabasetools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/eigenstratdatabasetools/eigenstratdatabasetools.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bamtools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bamtools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bamtools/bamtools.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'jellyfish' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule jellyfish raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/jellyfish/jellyfish.py\", line 18, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'vcftools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule vcftools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/vcftools/vcftools.py\", line 19, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'longranger' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule longranger raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/longranger/longranger.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'stacks' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule stacks raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/stacks/stacks.py\", line 19, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'varscan2' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule varscan2 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/varscan2/varscan2.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bbmap' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bbmap raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bbmap/bbmap.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bismark' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bismark raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bismark/bismark.py\", line 67, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'biscuit' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule biscuit raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/biscuit/biscuit.py\", line 30, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'hicexplorer' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule hicexplorer raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/hicexplorer/hicexplorer.py\", line 17, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'hicup' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule hicup raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/hicup/hicup.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'hicpro' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule hicpro raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/hicpro/hicpro.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'salmon' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule salmon raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/salmon/salmon.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'kallisto' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule kallisto raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/kallisto/kallisto.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'slamdunk' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule slamdunk raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/slamdunk/slamdunk.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'star' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule star raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/star/star.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'hisat2' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule hisat2 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/hisat2/hisat2.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'tophat' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule tophat raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/tophat/tophat.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bowtie2' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bowtie2 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bowtie2/bowtie2.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bowtie1' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bowtie1 raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bowtie1/bowtie1.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'snpsplit' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule snpsplit raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/snpsplit/snpsplit.py\", line 18, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'kat' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule kat raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/kat/kat.py\", line 18, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'leehom' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule leehom raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/leehom/leehom.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'adapterRemoval' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule adapterRemoval raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/adapterRemoval/adapterRemoval.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'clipandmerge' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule clipandmerge raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/clipandmerge/clipandmerge.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'cutadapt' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule cutadapt raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/cutadapt/cutadapt.py\", line 28, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'flexbar' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule flexbar raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/flexbar/flexbar.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'kaiju' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule kaiju raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/kaiju/kaiju.py\", line 20, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'kraken' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule kraken raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/kraken/kraken.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'malt' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule malt raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/malt/malt.py\", line 20, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'trimmomatic' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule trimmomatic raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/trimmomatic/trimmomatic.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'sickle' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule sickle raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/sickle/sickle.py\", line 17, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'skewer' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule skewer raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/skewer/skewer.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'sortmerna' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule sortmerna raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/sortmerna/sortmerna.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'biobloomtools' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule biobloomtools raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/biobloomtools/biobloomtools.py\", line 20, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'fastq_screen' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule fastq_screen raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/fastq_screen/fastq_screen.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'afterqc' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule afterqc raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/afterqc/afterqc.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'fastp' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule fastp raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/fastp/fastp.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'fastqc' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule fastqc raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/fastqc/fastqc.py\", line 36, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'pychopper' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule pychopper raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/pychopper/pychopper.py\", line 21, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'pycoqc' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule pycoqc raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/pycoqc/pycoqc.py\", line 19, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'minionqc' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule minionqc raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/minionqc/minionqc.py\", line 22, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'multivcfanalyzer' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule multivcfanalyzer raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/multivcfanalyzer/multivcfanalyzer.py\", line 25, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'clusterflow' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule clusterflow raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/clusterflow/clusterflow.py\", line 28, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'bcl2fastq' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule bcl2fastq raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/bcl2fastq/bcl2fastq.py\", line 18, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'interop' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule interop raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/interop/interop.py\", line 14, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'ivar' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule ivar raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/ivar/ivar.py\", line 23, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'flash' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule flash raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/flash/flash.py\", line 26, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'seqyclean' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule seqyclean raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/seqyclean/seqyclean.py\", line 18, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[ERROR ]\u001b[0m multiqc : \u001b[31mOops! The 'optitype' MultiQC module broke... 
\n Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues \n If possible, please include a log file that triggers the error - the last file found was:\n None\n============================================================\nModule optitype raised an exception: Traceback (most recent call last):\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/multiqc.py\", line 594, in run\n output = mod()\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/optitype/optitype.py\", line 24, in __init__\n super(MultiqcModule, self).__init__(\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/modules/base_module.py\", line 45, in __init__\n config.update({anchor: mod_cust_config.get(\"custom_config\", {})})\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 250, in update\n return update_dict(globals(), u)\n File \"/home/azureuser/mambaforge/lib/python3.10/site-packages/multiqc/utils/config.py\", line 256, in update_dict\n if isinstance(val, collections.Mapping):\nAttributeError: module 'collections' has no attribute 'Mapping'\n============================================================\u001b[0m\n\u001b[1;30m[WARNING]\u001b[0m multiqc : \u001b[33mNo analysis results found. Cleaning up..\u001b[0m\n\u001b[1;30m[INFO ]\u001b[0m multiqc : MultiQC complete\n" - } - ], - "execution_count": 25, + "execution_count": null, "metadata": { "gather": { "logged": 1682517201690 } - } + }, + "outputs": [], + "source": [ + "! multiqc -f data/fastqc -f\n", + "#! mv multiqc_data/ data/" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 8: Index the Transcriptome so that Trimmed Reads Can Be Mapped Using Salmon" - ], - "metadata": {} + "### Index the Transcriptome so that Trimmed Reads Can Be Mapped Using Salmon" + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "! 
salmon index -t data/reference/M_chelonae_transcripts.fasta -p 8 -i data/reference/transcriptome_index --decoys data/reference/decoys.txt -k 31 --keepDuplicates" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "Version Info: ### PLEASE UPGRADE SALMON ###\n### A newer version of salmon with important bug fixes and improvements is available. ####\n###\nThe newest version, available at https://github.com/COMBINE-lab/salmon/releases\ncontains new features, improvements, and bug fixes; please upgrade at your\nearliest convenience.\n###\nSign up for the salmon mailing list to hear about new versions, features and updates at:\nhttps://oceangenomics.com/subscribe\n###index [\"data/reference/transcriptome_index\"] did not previously exist . . . creating it\n[2023-04-26 13:54:40.001] [jLog] [info] building index\nout : data/reference/transcriptome_index\n\u001b[00m[2023-04-26 13:54:40.023] [puff::index::jointLog] [info] Running fixFasta\n\u001b[00m\n[Step 1 of 4] : counting k-mers\n\n\u001b[35m[2023-04-26 13:54:40.424] [puff::index::jointLog] [warning] There were 2 transcripts that would need to be removed to avoid duplicates.\n\u001b[00m\u001b[00m[2023-04-26 13:54:40.454] [puff::index::jointLog] [info] Replaced 0 non-ATCG nucleotides\n\u001b[00m\u001b[00m[2023-04-26 13:54:40.454] [puff::index::jointLog] [info] Clipped poly-A tails from 0 transcripts\n\u001b[00mwrote 4868 cleaned references\n\u001b[00m[2023-04-26 13:54:40.706] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers\n\u001b[00m\u001b[00m[2023-04-26 13:54:40.820] [puff::index::jointLog] [info] ntHll estimated 4966944 distinct k-mers, setting filter size to 2^27\n\u001b[00mThreads = 8\nVertex length = 31\nHash functions = 5\nFilter size = 134217728\nCapacity = 2\nFiles: \ndata/reference/transcriptome_index/ref_k31_fixed.fa\n--------------------------------------------------------------------------------\nRound 0, 
0:134217728\nPass\tFilling\tFiltering\n1\t1\t3\t\n2\t0\t0\nTrue junctions count = 10131\nFalse junctions count = 7124\nHash table size = 17255\nCandidate marks count = 34884\n--------------------------------------------------------------------------------\nReallocating bifurcations time: 0\nTrue marks count: 21319\nEdges construction time: 22\n--------------------------------------------------------------------------------\nDistinct junctions = 10131\n\nallowedIn: 13\nMax Junction ID: 10174\nseen.size():81401 kmerInfo.size():10175\napproximateContigTotalLength: 4947048\ncounters for complex kmers:\n(prec>1 & succ>1)=8 | (succ>1 & isStart)=0 | (prec>1 & isEnd)=0 | (isStart & isEnd)=2\ncontig count: 10353 element count: 5328123 complex nodes: 10\n# of ones in rank vector: 10352\n\u001b[00m[2023-04-26 13:55:08.167] [puff::index::jointLog] [info] Starting the Pufferfish indexing by reading the GFA binary file.\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.167] [puff::index::jointLog] [info] Setting the index/BinaryGfa directory data/reference/transcriptome_index\n\u001b[00msize = 5328123\n-----------------------------------------\n| Loading contigs | Time = 35.212 ms\n-----------------------------------------\nsize = 5328123\n-----------------------------------------\n| Loading contig boundaries | Time = 24.882 ms\n-----------------------------------------\nNumber of ones: 10352\nNumber of ones per inventory item: 512\nInventory entries filled: 21\n10352\n\u001b[00m[2023-04-26 13:55:08.237] [puff::index::jointLog] [info] Done wrapping the rank vector with a rank9sel structure.\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.237] [puff::index::jointLog] [info] contig count for validation: 10,352\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.261] [puff::index::jointLog] [info] Total # of Contigs : 10,352\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.261] [puff::index::jointLog] [info] Total # of numerical Contigs : 10,352\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.261] 
[puff::index::jointLog] [info] Total # of contig vec entries: 16,484\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.261] [puff::index::jointLog] [info] bits per offset entry 15\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.261] [puff::index::jointLog] [info] Done constructing the contig vector. 10353\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.389] [puff::index::jointLog] [info] # segments = 10,352\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.389] [puff::index::jointLog] [info] total length = 5,328,123\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.390] [puff::index::jointLog] [info] Reading the reference files ...\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.579] [puff::index::jointLog] [info] positional integer width = 23\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.579] [puff::index::jointLog] [info] seqSize = 5,328,123\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.579] [puff::index::jointLog] [info] rankSize = 5,328,123\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.579] [puff::index::jointLog] [info] edgeVecSize = 0\n\u001b[00m\u001b[00m[2023-04-26 13:55:08.579] [puff::index::jointLog] [info] num keys = 5,017,563\n\u001b[00mfor info, total work write each : 2.331 total work inram from level 3 : 4.322 total work raw : 25.000 \n[Building BooPHF] 100 % elapsed: 0 min 3 sec remaining: 0 min 0 sec\nBitarray 26296128 bits (100.00 %) (array + ranks )\nfinal hash 0 bits (0.00 %) (nb in final hash 0)\n\u001b[00m[2023-04-26 13:55:11.448] [puff::index::jointLog] [info] mphf size = 3.13474 MB\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk size = 666,016\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 0 = [0, 666,016)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 1 = [666,016, 1,332,032)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 2 = [1,332,032, 1,998,048)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 3 = [1,998,048, 
2,664,064)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 4 = [2,664,064, 3,330,080)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 5 = [3,330,080, 3,996,096)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 6 = [3,996,096, 4,662,112)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.453] [puff::index::jointLog] [info] chunk 7 = [4,662,112, 5,328,093)\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.635] [puff::index::jointLog] [info] finished populating pos vector\n\u001b[00m\u001b[00m[2023-04-26 13:55:11.635] [puff::index::jointLog] [info] writing index components\n\u001b[00m\u001b[00m[2023-04-26 13:55:12.061] [puff::index::jointLog] [info] finished writing dense pufferfish index\n\u001b[00m[2023-04-26 13:55:12.184] [jLog] [info] done building index\n" - } - ], - "execution_count": 28, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 9: Run Salmon to Map Reads to Transcripts and Quantify Expression Levels\n", + "### Run Salmon to Map Reads to Transcripts and Quantify Expression Levels\n", "Salmon aligns the trimmed reads to the reference transcriptome and generates the read counts per transcript. In this analysis, each gene has a single transcript." 
- ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true, + "tags": [] + }, + "outputs": [], "source": [ "%%bash\n", "salmon quant -i data/reference/transcriptome_index -l SR -r data/trimmed/SRR13349122_1_trimmed.fastq -p 8 --validateMappings -o data/quants/SRR13349122_quant\n", "salmon quant -i data/reference/transcriptome_index -l SR -r data/trimmed/SRR13349128_1_trimmed.fastq -p 8 --validateMappings -o data/quants/SRR13349128_quant" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stderr", - "text": "Version Info: ### PLEASE UPGRADE SALMON ###\n### A newer version of salmon with important bug fixes and improvements is available. ####\n###\nThe newest version, available at https://github.com/COMBINE-lab/salmon/releases\ncontains new features, improvements, and bug fixes; please upgrade at your\nearliest convenience.\n###\nSign up for the salmon mailing list to hear about new versions, features and updates at:\nhttps://oceangenomics.com/subscribe\n###### salmon (selective-alignment-based) v1.5.1\n### [ program ] => salmon \n### [ command ] => quant \n### [ index ] => { data/reference/transcriptome_index }\n### [ libType ] => { SR }\n### [ unmatedReads ] => { data/trimmed/SRR13349122_1_trimmed.fastq }\n### [ threads ] => { 8 }\n### [ validateMappings ] => { }\n### [ output ] => { data/quants/SRR13349122_quant }\nLogs will be written to data/quants/SRR13349122_quant/logs\n[2023-04-26 14:00:23.857] [jointLog] [info] setting maxHashResizeThreads to 8\n[2023-04-26 14:00:23.857] [jointLog] [info] Fragment incompatibility prior below threshold. Incompatible fragments will be ignored.\n[2023-04-26 14:00:23.857] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. 
Since not explicitly specified, it is being set to 0.65\n[2023-04-26 14:00:23.857] [jointLog] [info] Setting consensusSlack to selective-alignment default of 0.35.\n[2023-04-26 14:00:23.857] [jointLog] [info] parsing read library format\n[2023-04-26 14:00:23.857] [jointLog] [info] There is 1 library.\n[2023-04-26 14:00:24.001] [jointLog] [info] Loading pufferfish index\n[2023-04-26 14:00:24.029] [jointLog] [info] Loading dense pufferfish index.\n-----------------------------------------\n| Loading contig table | Time = 49.107 ms\n-----------------------------------------\nsize = 10353\n-----------------------------------------\n| Loading contig offsets | Time = 25.831 ms\n-----------------------------------------\n-----------------------------------------\n| Loading reference lengths | Time = 4.4126 ms\n-----------------------------------------\n-----------------------------------------\n| Loading mphf table | Time = 124.39 ms\n-----------------------------------------\nsize = 5328123\nNumber of ones: 10352\nNumber of ones per inventory item: 512\nInventory entries filled: 21\n-----------------------------------------\n| Loading contig boundaries | Time = 56.826 ms\n-----------------------------------------\nsize = 5328123\n-----------------------------------------\n| Loading sequence | Time = 86.304 ms\n-----------------------------------------\nsize = 5017563\n-----------------------------------------\n| Loading positions | Time = 283.15 ms\n-----------------------------------------\nsize = 9684800\n-----------------------------------------\n| Loading reference sequence | Time = 126.8 ms\n-----------------------------------------\n-----------------------------------------\n| Loading reference accumulative lengths | Time = 8.7525 ms\n-----------------------------------------\n\n\n\n\n\n\n\n\n\n\n\n\n[2023-04-26 14:00:24.865] [jointLog] [info] done\n[2023-04-26 14:00:24.865] [jointLog] [info] Index contained 4,868 targets\n[2023-04-26 14:00:24.867] [jointLog] 
[info] Number of decoys : 1\n[2023-04-26 14:00:24.867] [jointLog] [info] First decoy index : 4,867 \n[2023-04-26 14:00:25.119] [jointLog] [info] Thread saw mini-batch with a maximum of 0.28% zero probability fragments\n[2023-04-26 14:00:25.119] [jointLog] [info] Thread saw mini-batch with a maximum of 0.30% zero probability fragments\n[2023-04-26 14:00:25.120] [jointLog] [info] Thread saw mini-batch with a maximum of 0.10% zero probability fragments\n[2023-04-26 14:00:25.120] [jointLog] [info] Thread saw mini-batch with a maximum of 0.32% zero probability fragments\n[2023-04-26 14:00:25.126] [jointLog] [info] Thread saw mini-batch with a maximum of 0.41% zero probability fragments\n[2023-04-26 14:00:25.138] [jointLog] [info] Computed 1,145 rich equivalence classes for further processing\n[2023-04-26 14:00:25.138] [jointLog] [info] Counted 2,816 total reads in the equivalence classes \n[2023-04-26 14:00:25.141] [jointLog] [info] Number of mappings discarded because of alignment score : 191\n[2023-04-26 14:00:25.141] [jointLog] [info] Number of fragments entirely discarded because of alignment score : 262\n[2023-04-26 14:00:25.141] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 227\n[2023-04-26 14:00:25.141] [jointLog] [info] Number of fragments discarded because they have only dovetail (discordant) mappings to valid targets : 0\n[2023-04-26 14:00:25.142] [jointLog] [warning] Only 2816 fragments were mapped, but the number of burn-in fragments was set to 5000000.\nThe effective lengths have been computed using the observed mappings.\n\n[2023-04-26 14:00:25.142] [jointLog] [info] Mapping rate = 5.64906%\n\n[2023-04-26 14:00:25.142] [jointLog] [info] finished quantifyLibrary()\n[2023-04-26 14:00:25.197] [jointLog] [info] Starting optimizer\n[2023-04-26 14:00:25.201] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate\n[2023-04-26 14:00:25.201] [jointLog] [info] iteration = 0 | max rel diff. 
= 0.989424\n[2023-04-26 14:00:25.254] [jointLog] [info] iteration = 100 | max rel diff. = 0\n[2023-04-26 14:00:25.255] [jointLog] [info] Finished optimizer\n[2023-04-26 14:00:25.255] [jointLog] [info] writing output \n\nVersion Info: ### PLEASE UPGRADE SALMON ###\n### A newer version of salmon with important bug fixes and improvements is available. ####\n###\nThe newest version, available at https://github.com/COMBINE-lab/salmon/releases\ncontains new features, improvements, and bug fixes; please upgrade at your\nearliest convenience.\n###\nSign up for the salmon mailing list to hear about new versions, features and updates at:\nhttps://oceangenomics.com/subscribe\n###### salmon (selective-alignment-based) v1.5.1\n### [ program ] => salmon \n### [ command ] => quant \n### [ index ] => { data/reference/transcriptome_index }\n### [ libType ] => { SR }\n### [ unmatedReads ] => { data/trimmed/SRR13349128_1_trimmed.fastq }\n### [ threads ] => { 8 }\n### [ validateMappings ] => { }\n### [ output ] => { data/quants/SRR13349128_quant }\nLogs will be written to data/quants/SRR13349128_quant/logs\n[2023-04-26 14:00:26.693] [jointLog] [info] setting maxHashResizeThreads to 8\n[2023-04-26 14:00:26.693] [jointLog] [info] Fragment incompatibility prior below threshold. Incompatible fragments will be ignored.\n[2023-04-26 14:00:26.693] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. 
Since not explicitly specified, it is being set to 0.65\n[2023-04-26 14:00:26.693] [jointLog] [info] Setting consensusSlack to selective-alignment default of 0.35.\n[2023-04-26 14:00:26.693] [jointLog] [info] parsing read library format\n[2023-04-26 14:00:26.694] [jointLog] [info] There is 1 library.\n-----------------------------------------\n| Loading contig table | Time = 1.626 ms\n-----------------------------------------\nsize = 10353\n-----------------------------------------\n| Loading contig offsets | Time = 137.1 us\n-----------------------------------------\n-----------------------------------------\n| Loading reference lengths | Time = 18.9 us\n-----------------------------------------\n-----------------------------------------\n| Loading mphf table | Time = 1.0612 ms\n-----------------------------------------\nsize = 5328123\nNumber of ones: 10352\nNumber of ones per inventory item: 512\n[2023-04-26 14:00:26.761] [jointLog] [info] Loading pufferfish index\n[2023-04-26 14:00:26.761] [jointLog] [info] Loading dense pufferfish index.\nInventory entries filled: 21\n-----------------------------------------\n| Loading contig boundaries | Time = 9.9375 ms\n-----------------------------------------\nsize = 5328123\n-----------------------------------------\n| Loading sequence | Time = 1.0925 ms\n-----------------------------------------\nsize = 5017563\n-----------------------------------------\n| Loading positions | Time = 11.603 ms\n-----------------------------------------\nsize = 9684800\n-----------------------------------------\n| Loading reference sequence | Time = 2.755 ms\n-----------------------------------------\n-----------------------------------------\n| Loading reference accumulative lengths | Time = 34.3 us\n-----------------------------------------\n[2023-04-26 14:00:26.790] [jointLog] [info] done\n[2023-04-26 14:00:26.790] [jointLog] [info] Index contained 4,868 targets\n[2023-04-26 14:00:26.791] [jointLog] [info] Number of decoys : 
1\n[2023-04-26 14:00:26.791] [jointLog] [info] First decoy index : 4,867 \n\n\n\n\n\n\n\n\n\n\n\n\n[2023-04-26 14:00:27.121] [jointLog] [info] Thread saw mini-batch with a maximum of 0.22% zero probability fragments\n[2023-04-26 14:00:27.121] [jointLog] [info] Thread saw mini-batch with a maximum of 0.18% zero probability fragments\n[2023-04-26 14:00:27.121] [jointLog] [info] Thread saw mini-batch with a maximum of 0.14% zero probability fragments\n[2023-04-26 14:00:27.121] [jointLog] [info] Thread saw mini-batch with a maximum of 0.26% zero probability fragments\n[2023-04-26 14:00:27.123] [jointLog] [info] Thread saw mini-batch with a maximum of 0.20% zero probability fragments\n[2023-04-26 14:00:27.128] [jointLog] [info] Thread saw mini-batch with a maximum of 0.14% zero probability fragments\n[2023-04-26 14:00:27.138] [jointLog] [info] Computed 850 rich equivalence classes for further processing\n[2023-04-26 14:00:27.138] [jointLog] [info] Counted 1,906 total reads in the equivalence classes \n[2023-04-26 14:00:27.142] [jointLog] [info] Number of mappings discarded because of alignment score : 81\n[2023-04-26 14:00:27.142] [jointLog] [info] Number of fragments entirely discarded because of alignment score : 160\n[2023-04-26 14:00:27.142] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 151\n[2023-04-26 14:00:27.142] [jointLog] [info] Number of fragments discarded because they have only dovetail (discordant) mappings to valid targets : 0\n[2023-04-26 14:00:27.142] [jointLog] [warning] Only 1906 fragments were mapped, but the number of burn-in fragments was set to 5000000.\nThe effective lengths have been computed using the observed mappings.\n\n[2023-04-26 14:00:27.142] [jointLog] [info] Mapping rate = 3.82493%\n\n[2023-04-26 14:00:27.142] [jointLog] [info] finished quantifyLibrary()\n[2023-04-26 14:00:27.182] [jointLog] [info] Starting optimizer\n[2023-04-26 14:00:27.187] [jointLog] [info] Marked 0 weighted equivalence 
classes as degenerate\n[2023-04-26 14:00:27.187] [jointLog] [info] iteration = 0 | max rel diff. = 0.996302\n[2023-04-26 14:00:27.234] [jointLog] [info] iteration = 100 | max rel diff. = 0\n[2023-04-26 14:00:27.235] [jointLog] [info] Finished optimizer\n[2023-04-26 14:00:27.235] [jointLog] [info] writing output \n\n" - } - ], - "execution_count": 40, - "metadata": { - "scrolled": true, - "tags": [] - } + ] }, { "cell_type": "code", - "source": [ - "ls data/quants/" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "\u001b[0m\u001b[34;42mSRR13349122_quant\u001b[0m/ \u001b[34;42mSRR13349128_quant\u001b[0m/\r\n" - } - ], - "execution_count": 41, + "execution_count": null, "metadata": { + "gather": { + "logged": 1682518630201 + }, "jupyter": { - "source_hidden": false, - "outputs_hidden": false + "outputs_hidden": false, + "source_hidden": false }, "nteract": { "transient": { "deleting": false } - }, - "gather": { - "logged": 1682518630201 } - } + }, + "outputs": [], + "source": [ + "ls data/quants/" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 10: Report the top 10 most highly expressed genes in the samples" - ], - "metadata": {} + "### Report the top 10 most highly expressed genes in the samples" + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Top 10 most highly expressed genes in the wild-type sample.\n" - ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "! 
sort -nrk 4,4 data/quants/SRR13349122_quant/quant.sf | head -10" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "BB28_RS23830\t213\t10.625\t48612.291220\t5.000\r\nBB28_RS02220\t204\t9.377\t33047.563397\t3.000\r\nBB28_RS05530\t180\t6.996\t29531.286140\t2.000\r\nBB28_RS18945\t222\t12.150\t25504.663975\t3.000\r\nBB28_RS11370\t195\t8.348\t24748.475090\t2.000\r\nBB28_RS12480\t207\t9.766\t21154.305555\t2.000\r\nBB28_RS18745\t300\t51.326\t20125.718383\t10.000\r\nBB28_RS20695\t231\t14.032\t14723.212476\t2.000\r\nBB28_RS19155\t282\t36.744\t14056.208165\t5.000\r\nBB28_RS18020\t189\t7.759\t13312.711241\t1.000\r\nsort: write failed: 'standard output': Broken pipe\r\nsort: write error\r\n" - } - ], - "execution_count": 42, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Top 10 most highly expressed genes in the double lysogen sample.\n" - ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!sort -nrk 4,4 data/quants/SRR13349128_quant/quant.sf | head -10" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "BB28_RS18025\t177\t6.769\t47953.929601\t2.000\r\nBB28_RS02220\t204\t9.377\t34613.921846\t2.000\r\nBB28_RS13585\t243\t17.264\t28200.832626\t3.000\r\nBB28_RS01170\t225\t12.734\t25489.885138\t2.000\r\nBB28_RS20695\t231\t14.032\t23131.574929\t2.000\r\nBB28_RS19045\t183\t7.236\t22428.250651\t1.000\r\nBB28_RS04995\t192\t8.045\t20173.388438\t1.000\r\nBB28_RS14885\t195\t8.348\t19441.110656\t1.000\r\nBB28_RS18745\t300\t51.326\t18971.657043\t6.000\r\nBB28_RS23535\t201\t9.012\t18007.533576\t1.000\r\nsort: write failed: 'standard output': Broken pipe\r\nsort: write error\r\n" - } - ], - "execution_count": 43, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### STEP 11: Report the expression of a putative acyl-ACP desaturase (BB28_RS16545) that was downregulated in the double lysogen 
relative to wild-type\n", + "### Report the expression of a putative acyl-ACP desaturase (BB28_RS16545) that was downregulated in the double lysogen relative to wild-type\n", "A acyl-transferase was reported to be downregulated in the double lysogen as shown in the table of the top 20 upregulated and downregulated genes from the paper describing the study." - ], - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Use `grep` to report the expression in the wild-type sample. The fields in the Salmon `quant.sf` file are as follows. The level of expression is reported in the Transcripts Per Million (`TPM`) and number of reads (`NumReads`) fields: \n", "`Name Length EffectiveLength TPM NumReads`" - ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!grep 'BB28_RS16545' data/quants/SRR13349122_quant/quant.sf" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "BB28_RS16545\t987\t737.000\t560.631139\t4.000\r\n" - } - ], - "execution_count": 44, - "metadata": {} + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ "Use `grep` to report the expression in the double lysogen sample. The fields in the Salmon `quant.sf` file are as follows. 
The level of expression is reported in the Transcripts Per Million (`TPM`) and number of reads (`NumReads`) fields: \n", "`Name Length EffectiveLength TPM NumReads`" - ], - "metadata": {} + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "!grep 'BB28_RS16545' data/quants/SRR13349128_quant/quant.sf" - ], - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": "BB28_RS16545\t987\t737.000\t220.201284\t1.000\r\n" - } - ], - "execution_count": 45, - "metadata": {} + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusion\n", + "Here you learned how to import data to and from a Blob storage container and then use fastq files to run basic RNAseq analysis! " + ] }, { "cell_type": "markdown", + "metadata": {}, "source": [ - "### That's it! " - ], - "metadata": {} + "## Clean Up\n", + "Make sure you stop your compute instance and if desired, delete the resource group associated with this tutorial." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] } ], "metadata": { + "kernel_info": { + "name": "python3" + }, "kernelspec": { - "name": "python3", + "display_name": "Python 3 (ipykernel)", "language": "python", - "display_name": "Python 3 (ipykernel)" + "name": "python3" }, "language_info": { - "name": "python", - "version": "3.8.13", - "mimetype": "text/x-python", "codemirror_mode": { "name": "ipython", "version": 3 }, - "pygments_lexer": "ipython3", + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", "nbconvert_exporter": "python", - "file_extension": ".py" + "pygments_lexer": "ipython3", + "version": "3.8.13" }, "microsoft": { "ms_spell_check": { "ms_spell_check_language": "en" } }, - "kernel_info": { - "name": "python3" - }, "nteract": { "version": "nteract-front-end@1.0.0" } }, "nbformat": 4, "nbformat_minor": 4 -} \ No newline at end of file +} From 39fb6c5ddafefb36d04d9691cea2b900d3092a05 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 07:41:18 -0500 Subject: [PATCH 10/25] Update README.md fixed grammer --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index c5bb462..a7dee0e 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ NIH Cloud Lab’s goal is to make Cloud easy and accessible for you, so that you can spend less time on administrative tasks and focus more on research. -Use this repository to learn about how to use Azure by exploring the linked resources and walking through the tutorials. If you are a beginner, we suggest you begin the jumpstart section on the [Cloud Lab website](https://cloud.nih.gov/resources/cloudlab/) before returning here. +Use this repository to learn about how to use Azure by exploring the linked resources and walking through the tutorials. 
If you are a beginner, we suggest you start with the jumpstart section on the [Cloud Lab website](https://cloud.nih.gov/resources/cloudlab/) before returning here. --------------------------------- ## Overview of Page Contents @@ -24,10 +24,10 @@ Use this repository to learn about how to use Azure by exploring the linked reso ## **Artificial Intelligence** Machine learning is a subfield of artificial intelligence that focuses on the development of algorithms and models that enable computers to learn from and make predictions or decisions based on data, without being explicitly programmed. Artificial intelligence and machine learning algorithms are being applied to a variety of biomedical research questions, ranging from image classification to genomic variant calling. Azure offers AI services through Azure AI Studio and Azure Machine Learning. -See our suite of tutorials to learn more about [Gen AI on Azure](/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). If you are interested in applying RAG to structured data like a csv file, we created one tutorials that walks you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and a query your database using a [notebook within Azure ML](/notebooks/GenAI/notebooks/llm_query_csv.ipynb), and [one tutorial that runs all the necessary steps directly from a notebook](/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). 
+See our suite of tutorials to learn more about [Gen AI on Azure](/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). If you are interested in configuring a model to work with structured data like csv or json files, we've created tutorials that walk you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and query your database using a [notebook within Azure ML](/notebooks/GenAI/notebooks/llm_query_csv.ipynb). We also have another [tutorial that runs all the necessary steps directly from a notebook](/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). ## **Clinical Informatics with FHIR** - Azure Health Data Services is a set of services that enables you to store, process, and analyze medical data in Azure. These services are designed to help organizations quickly connect disparate health data sources and formats, such as structured, imaging, and device data, and normalize it to be persisted in the cloud. At its core, Azure Health Data Services possesses the ability to transform and ingest data into FHIR (Fast Healthcare Interoperability Resources) format. This allows you to transform health data from legacy formats, such as HL7v2 or CDA, or from high-frequency IoT data in device proprietary formats to FHIR. This makes it easier to connect data stored in Azure Health Data Services with services across the Azure ecosystem, like Azure Synapse Analytics, and Azure Machine Learning (Azure ML). 
+Azure Health Data Services is a set of services that enables you to store, process, and analyze medical data in Azure. These services are designed to help organizations quickly connect disparate health data sources and formats, such as structured, imaging, and device data, and normalize it to be persisted in the cloud. At its core, Azure Health Data Services possesses the ability to transform and ingest data into FHIR (Fast Healthcare Interoperability Resources) format. This allows you to transform health data from legacy formats, such as HL7v2 or CDA, or from high-frequency IoT data in device proprietary formats to FHIR. This makes it easier to connect data stored in Azure Health Data Services with services across the Azure ecosystem, like Azure Synapse Analytics, and Azure Machine Learning (Azure ML). Azure Health Data Services includes support for multiple health data standards for the exchange of structured data, and the ability to deploy multiple instances of different service types (FHIR, DICOM, and MedTech) that seamlessly work with one another. Services deployed within a workspace also share a compliance boundary and common configuration settings. The product scales automatically to meet the varying demands of your workloads, so you spend less time managing infrastructure and more time generating insights from health data. 
From 8e47d3e2c01d3cdf604fdd8442c7e8e266e12900 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 07:46:26 -0500 Subject: [PATCH 11/25] Update GWAS_coat_color.ipynb minor grammer update --- notebooks/GWAS/GWAS_coat_color.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/GWAS/GWAS_coat_color.ipynb b/notebooks/GWAS/GWAS_coat_color.ipynb index 0bcfbc1..fd6bf6d 100644 --- a/notebooks/GWAS/GWAS_coat_color.ipynb +++ b/notebooks/GWAS/GWAS_coat_color.ipynb @@ -47,7 +47,7 @@ "metadata": {}, "source": [ "### Download the data\n", - "use %%bash to denote a bash block. You can also use '!' to denote a single bash command within a Python notebook" + "Use %%bash to denote a bash block. You can also use '!' to denote a single bash command within a Python notebook" ] }, { From 90473e3a9ea07697ce2d7b9e35fd7c74ec19b523 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 07:59:13 -0500 Subject: [PATCH 12/25] Update AzureAIStudio_index_structured_notebook.ipynb added in the need for access to Azure AI Search Service --- .../notebooks/AzureAIStudio_index_structured_notebook.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb index cd56dd8..66b7253 100644 --- a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb +++ b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb @@ -21,7 +21,7 @@ "metadata": {}, "source": [ "## Prerequisites\n", - "We assume you have access to Azure AI Studio and have already deployed an LLM." + "We assume you have access to Azure AI Studio and Azure AI Search Service and have already deployed an LLM." 
] }, { From 08c03374f5a97731fb621620bcb2102c83680840 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:01:48 -0500 Subject: [PATCH 13/25] Update AzureAIStudio_index_structured_with_console.ipynb fixed grammer --- .../notebooks/AzureAIStudio_index_structured_with_console.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb index 29ad25b..5ad4ee9 100644 --- a/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb +++ b/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb @@ -28,7 +28,7 @@ "metadata": {}, "source": [ "## Prerequisites\n", - "We assume you have access to Azure AI Studio and Azure AI Search Service. We assume you and have already deployed an LLM." + "We assume you have access to both Azure AI Studio and Azure AI Search Service, and have already deployed an LLM." ] }, { From e240ae7bcd7e6dcf5bce3af2cad20cbb78b00b7a Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:07:48 -0500 Subject: [PATCH 14/25] Update Azure_Pubmed_chatbot.ipynb added need for AI search in prereq --- notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index 70363cd..2b61d96 100644 --- a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -13,7 +13,7 @@ "metadata": {}, "source": [ "## Overview\n", - "[PubMed](https://pubmed.ncbi.nlm.nih.gov/about/) supports the search and retrieval of biomedical and life sciences literature with the aim of improving health–both globally and personally. 
Here we create a chatbot that is grounded on PubMed data. Most Azure command line tools are already installed and it is recommended to use the **AzureML** kernel in your Jupyter notebook." + "[PubMed](https://pubmed.ncbi.nlm.nih.gov/about/) supports the search and retrieval of biomedical and life sciences literature with the aim of improving health both globally and personally. Here we create a chatbot that is grounded on PubMed data. Most Azure command line tools are already installed and it is recommended to use the **AzureML** kernel in your Jupyter notebook." ] }, { @@ -21,7 +21,7 @@ "metadata": {}, "source": [ "## Prerequisites\n", - "We assume you have access to Azure AI Studio and have already deployed an LLM." + "We assume you have access to both Azure AI Studio and Azure AI Search, and have already deployed an LLM." ] }, { From 53cb6128497f884490078bcd24e44f649a6f4996 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:09:10 -0500 Subject: [PATCH 15/25] Update Azure_Pubmed_chatbot.ipynb --- notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index 2b61d96..1de15b3 100644 --- a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -57,7 +57,7 @@ "id": "9dbd13e7-afc9-416b-94dc-418a93e14587", "metadata": {}, "source": [ - "In this tutorial we will be using Azure OpenAI which you can learn how to deploy [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=cli). This tutorial utilizes the model **gpt-35-turbo** version 0301 and the embeddings model **text-embedding-ada-002** version 2." 
+ "In this tutorial we will be using Azure OpenAI, which you can learn how to deploy [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=cli) if you haven't already. This tutorial utilizes the model **gpt-35-turbo** version 0301 and the embeddings model **text-embedding-ada-002** version 2." ] }, { From 79ca8677378d94e22f4664bb5e755acf1de9e4ed Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:12:27 -0500 Subject: [PATCH 16/25] Update Azure_Pubmed_chatbot.ipynb fixed imports --- notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb | 3 +++ 1 file changed, 3 insertions(+) diff --git a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index 1de15b3..fe95398 100644 --- a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -666,6 +666,9 @@ " SearchIndexerDataSourceConnection,\n", " SearchIndex,\n", " SearchIndexer\n", + " SearchableField\n", + " SearchFieldDataType\n", + " SimpleField\n" ")\n", "\n", "endpoint = \"https://{}.search.windows.net/\".format(service_name)\n", From 2ec8a9e1dae0d1f3fafc861f91894d4c82531ed2 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:14:48 -0500 Subject: [PATCH 17/25] Update Azure_Pubmed_chatbot.ipynb --- notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb index fe95398..ebea506 100644 --- a/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb +++ b/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb @@ -665,10 +665,10 @@ " SearchIndexerDataContainer,\n", " SearchIndexerDataSourceConnection,\n", " SearchIndex,\n", - " SearchIndexer\n", - " SearchableField\n", - " 
SearchFieldDataType\n", - " SimpleField\n" + " SearchIndexer,\n", + " SearchableField,\n", + " SearchFieldDataType,\n", + " SimpleField,\n", ")\n", "\n", "endpoint = \"https://{}.search.windows.net/\".format(service_name)\n", From e2f7d4bf2a5d35cf5491e06e0a827a2ad0347c2a Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:24:14 -0500 Subject: [PATCH 18/25] Update pangolin_pipeline.ipynb removed gray from text --- notebooks/pangolin/pangolin_pipeline.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/pangolin/pangolin_pipeline.ipynb b/notebooks/pangolin/pangolin_pipeline.ipynb index 3603c9a..453aa95 100644 --- a/notebooks/pangolin/pangolin_pipeline.ipynb +++ b/notebooks/pangolin/pangolin_pipeline.ipynb @@ -5,7 +5,7 @@ "id": "31e8c3cd", "metadata": {}, "source": [ - "# Pangolin SARS-CoV-2 Pipeline Notebook" + "# Pangolin SARS-CoV-2 Pipeline Notebook" ] }, { From 7284fbf8e170e695698bc3170f76b9e539857f4d Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:49:56 -0500 Subject: [PATCH 19/25] Update SpleenSeg_Pretrained-4_27.ipynb fixed formatting by removing h4 format --- .../SpleenSeg_Pretrained-4_27.ipynb | 52 ++++++++----------- 1 file changed, 23 insertions(+), 29 deletions(-) diff --git a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb index e6b6117..39b5947 100644 --- a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb +++ b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb @@ -51,7 +51,7 @@ "id": "f59ba435", "metadata": {}, "source": [ - "##### Uncomment below to install all dependencies" + "Uncomment below to install all dependencies." 
] }, { @@ -156,7 +156,7 @@ "id": "6f523cbf", "metadata": {}, "source": [ - "#### Running a pretrained model" + "### Running a pretrained model" ] }, { @@ -174,7 +174,7 @@ "id": "e9f3e5f3", "metadata": {}, "source": [ - "#### Create the directory for storing data" + "Create the directory for storing data" ] }, { @@ -194,7 +194,7 @@ "id": "38463a18", "metadata": {}, "source": [ - "#### Download the public dataset" + "### Download the public dataset" ] }, { @@ -217,7 +217,7 @@ "id": "fae7c51b", "metadata": {}, "source": [ - "#### Create Date Dictionaries and separate files from training and validation" + "### Create Date Dictionaries and separate files from training and validation" ] }, { @@ -243,7 +243,7 @@ "id": "974fc5aa", "metadata": {}, "source": [ - "#### Define your transformations for training and validation" + "### Define your transformations for training and validation" ] }, { @@ -327,7 +327,7 @@ "id": "ba3c7695", "metadata": {}, "source": [ - "#### Visualize Image and Label (example)" + "### Visualize Image and Label (example)" ] }, { @@ -358,9 +358,9 @@ "id": "f45ba707", "metadata": {}, "source": [ - "#### Use a dataloader to load files\n", - " - Ability to use LMDB (Lightning Memory-Mapped Database)\n", - " - Here is where transforms take place and they happen on both images and labels" + "### Use a dataloader to load files\n", + "Ability to use LMDB (Lightning Memory-Mapped Database)\n", + " - Here is where transforms take place and they happen on both images and labels" ] }, { @@ -400,7 +400,7 @@ "id": "a77e7856", "metadata": {}, "source": [ - "#### Now we want to download the pretrained model from NVIDIA" + "### Download the pretrained model from NVIDIA" ] }, { @@ -476,8 +476,7 @@ "id": "39910557", "metadata": {}, "source": [ - "### This will be our test file we will view for reference\n", - " - Here we see how our initial model appears to perform" + "This will be our test file we will view for reference. 
Here we see how our initial model appears to perform." ] }, { @@ -496,7 +495,7 @@ "id": "2544a774", "metadata": {}, "source": [ - "#### We use a sliding window technique to search the image" + "We use a sliding window technique to search the image." ] }, { @@ -566,8 +565,7 @@ "id": "2f60e5b5", "metadata": {}, "source": [ - "#### Using just the pretrained model, it appears we are performing pretty well\n", - " - We can now continue to train with our data using the NVIDIA models initial weights" + "Using just the pretrained model, it appears we are performing pretty well! We can now continue to train with our data using the NVIDIA models initial weights" ] }, { @@ -576,8 +574,7 @@ "metadata": {}, "source": [ "## Training\n", - "#### Without a GPU, training can take a while\n", - "#### Recommend skipping next three cells and load in model" + " Without a GPU, training can take a while, we recommend skipping next three cells and load in model.", ] }, { @@ -700,7 +697,7 @@ "id": "4ff0035d", "metadata": {}, "source": [ - "#### The model shows that it has improved fairly quickly over just 25 epochs" + "The model shows that it has improved fairly quickly over just 25 epochs." ] }, { @@ -709,7 +706,7 @@ "metadata": {}, "source": [ "## Inference\n", - "#### Without GPU skip to here to load previously trained best model (without a gpu the training will take a while)" + "Without GPU skip to here to load previously trained best model (without a gpu the training will take a while)." ] }, { @@ -727,7 +724,7 @@ "id": "fab5b4b9", "metadata": {}, "source": [ - "#### With the model loaded let's see if much has changed for our example image" + "With the model loaded let's see if much has changed for our example image." ] }, { @@ -798,7 +795,7 @@ "id": "6606bce2", "metadata": {}, "source": [ - "#### We see not much has changed, which is a good sign for how well the NVIDIA model performs out of the box." 
+ "We see not much has changed, which is a good sign for how well the NVIDIA model performs out of the box." ] }, { @@ -806,7 +803,7 @@ "id": "5cfd20c6", "metadata": {}, "source": [ - "#### Here is the final image of our Spleen" + "Here is the final image of our Spleen!" ] }, { @@ -827,7 +824,7 @@ "id": "6030d210", "metadata": {}, "source": [ - "#### Feel free to play around in this notebook or download it and use it where a GPU is accessible" + "Feel free to play around in this notebook or download it and use it where a GPU is accessible." ] }, { @@ -836,8 +833,7 @@ "metadata": {}, "source": [ "## Additional Exercise: Use liver segmentation in addition to spleen\n", - " - Just need to load liver segmentation from NVIDIA\n", - " - While we can't train this model, since we don't have training data, we can use it as a rough estimate" + "Here we are loading in liver segmentation from NVIDIA. While we can't train this model, since we don't have training data, we can use it as a rough estimate." ] }, { @@ -973,9 +969,7 @@ "id": "af1169b6", "metadata": {}, "source": [ - "#### Continue including more models found at the NGC Catalog: \n", - "#### https://catalog.ngc.nvidia.com/models\n", - "##### - Recommend filtering by 'CT' " + "Continue including more models found at the NGC Catalog: https://catalog.ngc.nvidia.com/models. 
We recommend filtering by 'CT'.", ] }, { From 2386f018ae35907cd3694a2814e3d01eb9a136c3 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 08:51:20 -0500 Subject: [PATCH 20/25] Update SpleenSeg_Pretrained-4_27.ipynb --- .../SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb index 39b5947..88c47f7 100644 --- a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb +++ b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb @@ -969,7 +969,7 @@ "id": "af1169b6", "metadata": {}, "source": [ - "Continue including more models found at the NGC Catalog: https://catalog.ngc.nvidia.com/models. We recommend filtering by 'CT'.", + "Continue including more models found at the NGC Catalog: https://catalog.ngc.nvidia.com/models. We recommend filtering by 'CT'." 
] }, { From c7b9b4d99506a8877220f79da53dedc21b225226 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 09:00:07 -0500 Subject: [PATCH 21/25] Update SpleenSeg_Pretrained-4_27.ipynb --- .../SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb index 88c47f7..cf8b3fe 100644 --- a/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb +++ b/notebooks/SpleenLiverSegmentation/SpleenSeg_Pretrained-4_27.ipynb @@ -359,8 +359,7 @@ "metadata": {}, "source": [ "### Use a dataloader to load files\n", - "Ability to use LMDB (Lightning Memory-Mapped Database)\n", - " - Here is where transforms take place and they happen on both images and labels" + "Ability to use LMDB (Lightning Memory-Mapped Database). Here is where transforms take place and they happen on both images and labels." ] }, { @@ -574,7 +573,7 @@ "metadata": {}, "source": [ "## Training\n", - " Without a GPU, training can take a while, we recommend skipping next three cells and load in model.", + " Without a GPU, training can take a while, we recommend skipping next three cells and load in model." 
] }, { From b06e1c3d5f8c6d325e3498b692c5ac4bd9438033 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 10:14:34 -0500 Subject: [PATCH 22/25] fixed links Azure_Open_AI_README.md --- notebooks/GenAI/Azure_Open_AI_README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/notebooks/GenAI/Azure_Open_AI_README.md b/notebooks/GenAI/Azure_Open_AI_README.md index 427234c..5f7ae1c 100644 --- a/notebooks/GenAI/Azure_Open_AI_README.md +++ b/notebooks/GenAI/Azure_Open_AI_README.md @@ -8,11 +8,11 @@ Welcome to this repository, a comprehensive collection of examples that will hel - 4 Python scripts that demonstrate how to use Azure OpenAI Embeddings to create embedding applications. - 42 in-depth content slides on the information covered in this workshop. Please find ```aoai_workshop_content.pdf``` in [search_documents](https://github.com/t-cjackson/Azure-OpenAI-Workshop/tree/main/search_documents) folder in this repository. -The purpose of this workshop is to equip participants with the necessary skills to make the most out of the Azure OpenAI Playground, Prompt Engineering, and Azure OpenAI Embeddings in Python. You can view in-depth info on these topics in the [workshop slides](/tutorials/notebooks/GenAI/search_documents/aoai_workshop_content.pdf). +The purpose of this workshop is to equip participants with the necessary skills to make the most out of the Azure OpenAI Playground, Prompt Engineering, and Azure OpenAI Embeddings in Python. You can view in-depth info on these topics in the [workshop slides](/notebooks/GenAI/search_documents/aoai_workshop_content.pdf). You can also learn a lot about the details of using Azure OpenAI at this [site](https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line&pivots=programming-language-studio). 
-We recommend you 1) go through the steps in this README, 2) complete the general notebook called `notebooks/AzureOpenAI_embeddings.ipynb`, then 3) explore the other notebooks at [this directory](/tutorials/notebooks/GenAI/notebooks) +We recommend you 1) go through the steps in this README, 2) complete the general notebook called `notebooks/AzureOpenAI_embeddings.ipynb`, then 3) explore the other notebooks at [this directory](/notebooks/GenAI/notebooks) ## Overview of Page Contents + [Azure OpenAI Playground Prerequisites](#Azure-OpenAI-Playground-Prerequisites) From cc1f1d5c7c940ea887ad66789d26e6c4807ecc01 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 10:23:49 -0500 Subject: [PATCH 23/25] fixed links Azure_AI_Studio_README.md --- notebooks/GenAI/Azure_AI_Studio_README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/notebooks/GenAI/Azure_AI_Studio_README.md b/notebooks/GenAI/Azure_AI_Studio_README.md index 5a0fbbd..8c5573b 100644 --- a/notebooks/GenAI/Azure_AI_Studio_README.md +++ b/notebooks/GenAI/Azure_AI_Studio_README.md @@ -5,11 +5,11 @@ Microsoft Azure migrated the AI front end from Azure OpenAI to Azure AI Studio. Welcome to this repository, a comprehensive collection of examples that will help you chat with your data using the Azure OpenAI Studio Playground, create highly efficient large language model prompts, and build Azure OpenAI embeddings. -The purpose of this workshop is to equip participants with the necessary skills to make the most out of the Azure OpenAI Playground, Prompt Engineering, and Azure OpenAI Embeddings in Python. You can view in-depth info on these topics in the [workshop slides](/tutorials/notebooks/GenAI/search_documents/aoai_workshop_content.pdf). +The purpose of this workshop is to equip participants with the necessary skills to make the most out of the Azure OpenAI Playground, Prompt Engineering, and Azure OpenAI Embeddings in Python. 
You can view in-depth info on these topics in the [workshop slides](/notebooks/GenAI/search_documents/aoai_workshop_content.pdf). You can also learn a lot about the details of using Azure AI at this [site](https://azure.microsoft.com/en-us/products/ai-studio). -We recommend you 1) go through the steps in this README, 2) complete the general notebook called `notebooks/AzureOpenAI_embeddings.ipynb`, then 3) explore the other notebooks at [this directory](/tutorials/notebooks/GenAI/notebooks) +We recommend you 1) go through the steps in this README, 2) complete the general notebook called `notebooks/AzureOpenAI_embeddings.ipynb`, then 3) explore the other notebooks at [this directory](/notebooks/GenAI/notebooks) ## Overview of Page Contents + [Azure AI Playground Prerequisites](#Azure-OpenAI-Playground-Prerequisites) @@ -89,7 +89,7 @@ On the far right under *Configuration*, you can modify which model you are deplo ![modify deployment](/docs/images/19_deployment.png) -Finally, you can select the `parameters` tab to modify the model parameters. Review [this presentation](/tutorials/notebooks/GenAI/search_documents/aoai_workshop_content.pdf) to learn more about the parameters. +Finally, you can select the `parameters` tab to modify the model parameters. Review [this presentation](/notebooks/GenAI/search_documents/aoai_workshop_content.pdf) to learn more about the parameters. ![modify parameters](/docs/images/20_parameters.png) @@ -396,7 +396,7 @@ Creating embeddings of search documents allows you to use vector search, which i ### Environment Setup Navigate to your [Azure Machine Learning Studio environment](https://github.com/STRIDES/NIHCloudLabAzure#launch-a-machine-learning-workspace-jupyter-environment-). If you have not created your environment, [create one now](https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-cloud-workstation?view=azureml-api-2). 
-Navigate to `Notebooks`, then clone this Git repo into your environment and navigate to the notebook called [AzureOpenAI_embeddings.ipynb](/tutorials/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb). +Navigate to `Notebooks`, then clone this Git repo into your environment and navigate to the notebook called [AzureOpenAI_embeddings.ipynb](/notebooks/GenAI/notebooks/AzureOpenAI_embeddings.ipynb). You will need a variety of parameters to authenticate with the API. You can find these within the Playground by clicking **View Code**. Input these parameters into the notebook cell when asked. From fb7c4af71b574f98c808d4ab234d205f5be6bdb3 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 11:20:41 -0500 Subject: [PATCH 24/25] fixed links README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index a7dee0e..a1caeb9 100644 --- a/README.md +++ b/README.md @@ -24,7 +24,7 @@ Use this repository to learn about how to use Azure by exploring the linked reso ## **Artificial Intelligence** Machine learning is a subfield of artificial intelligence that focuses on the development of algorithms and models that enable computers to learn from and make predictions or decisions based on data, without being explicitly programmed. Artificial intelligence and machine learning algorithms are being applied to a variety of biomedical research questions, ranging from image classification to genomic variant calling. Azure offers AI services through Azure AI Studio and Azure Machine Learning. 
-See our suite of tutorials to learn more about [Gen AI on Azure](/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/notebooks/GenAI/notebooks/AzureOpenAI-langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). If you are interested in configuring a model to work with structured data like csv or json files, we've created tutorials that walk you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and query your database using a [notebook within Azure ML](/notebooks/GenAI/notebooks/llm_query_csv.ipynb). We also have another [tutorial that runs all the necessary steps directly from a notebook](/notebooks/GenAI/notebooks/azure_ai_search_structured.ipynb). +See our suite of tutorials to learn more about [Gen AI on Azure](/notebooks/GenAI/) that highlight Azure products such as [Azure AI Studio](/notebooks/GenAI/Azure_AI_Studio_README.md), [Azure OpenAI](/notebooks/GenAI/Azure_Open_AI_README.md) and [Azure AI Search](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb) and external tools like [Langchain](/notebooks/GenAI/notebooks/AzureAIStudio_langchain.ipynb). These notebooks walk you through how to deploy, train, and query models, as well as how to implement techniques like [Retrieval-Augmented Generation (RAG)](/notebooks/GenAI/notebooks/Azure_Pubmed_chatbot.ipynb). 
If you are interested in configuring a model to work with structured data like csv or json files, we've created tutorials that walk you through how to index your csv using the [Azure UI](/docs/create_index_from_csv.md) and query your database using a [notebook within Azure ML](/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb). We also have another [tutorial that runs all the necessary steps directly from a notebook](/notebooks/GenAI/notebooks/AzureAIStudio_index_structured_notebook.ipynb). ## **Clinical Informatics with FHIR** Azure Health Data Services is a set of services that enables you to store, process, and analyze medical data in Azure. These services are designed to help organizations quickly connect disparate health data sources and formats, such as structured, imaging, and device data, and normalize it to be persisted in the cloud. At its core, Azure Health Data Services possesses the ability to transform and ingest data into FHIR (Fast Healthcare Interoperability Resources) format. This allows you to transform health data from legacy formats, such as HL7v2 or CDA, or from high-frequency IoT data in device proprietary formats to FHIR. This makes it easier to connect data stored in Azure Health Data Services with services across the Azure ecosystem, like Azure Synapse Analytics, and Azure Machine Learning (Azure ML). 
From beebb27d5814e3f65e30eb8447dbbcb0c6803124 Mon Sep 17 00:00:00 2001 From: zbyosufzai <145053952+zbyosufzai@users.noreply.github.com> Date: Fri, 8 Mar 2024 11:22:52 -0500 Subject: [PATCH 25/25] fixed links create_index_from_csv.md --- docs/create_index_from_csv.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/create_index_from_csv.md b/docs/create_index_from_csv.md index 5801510..bc1563f 100644 --- a/docs/create_index_from_csv.md +++ b/docs/create_index_from_csv.md @@ -69,7 +69,7 @@ Navigate to `Indexes` on the left panel and wait until your index shows as many ![Check index](/docs/images/10_check_index.png) -And that is it! Now return to [the tutorial notebook to run queries against this csv using GPT-4]( /tutorials/notebooks/GenAI/notebooks/llm_query_csv.ipynb). +And that is it! Now return to [the tutorial notebook to run queries against this csv using GPT-4]( /notebooks/GenAI/notebooks/AzureAIStudio_index_structured_with_console.ipynb).