Update README and add plots to examples. Also clean up the cord19 example.
lmcinnes committed Jan 7, 2024
1 parent 3ceefbc commit 0fc9728
Showing 7 changed files with 95 additions and 11 deletions.
85 changes: 81 additions & 4 deletions README.rst
@@ -9,7 +9,84 @@
DataMapPlot
===========

-Creating beautiful plots of data maps. This provides basic tools for generating presentation or publication worthy
-static plots of labelled data maps. All you need to do is label clusters of points in the data map and DataMapPlot
-will take care of the rest. There are a number of options for tweaking the results, but the aim is to have something
-good-looking straight out-of-the-box.
+Creating beautiful plots of data maps. DataMapPlot is a small library designed to help you make beautiful data map
+plots for inclusion in presentations, posters and papers. The focus is on producing static plots that are great
+looking with as little work for you as possible. All you need to do is label clusters of points in the data map and
+DataMapPlot will take care of the rest. While this involves automating most of the aesthetic choices, the library
+provides a wide variety of ways to customize the resulting plot to your needs.

--------
Examples
--------

Some examples of the kind of output that DataMapPlot can provide.

A basic plot, with some highlighted labels:

.. image:: examples/plot_cord19.png
   :width: 1024
   :alt: A data map plot of the CORD-19 dataset
   :align: center

Using dark mode and some custom font choices:

.. image:: examples/plot_arxiv_ml.png
   :width: 1024
   :alt: A data map plot of papers from ArXiv ML
   :align: center

Alternative custom styling:

.. image:: examples/plot_wikipedia.png
   :width: 1024
   :alt: A data map plot of Simple Wikipedia
   :align: center

Custom arrow styles, fonts, and colour maps:

.. image:: examples/plot_simple_arxiv.png
   :width: 1024
   :alt: A styled data map plot of papers from ArXiv ML
   :align: center

------------
Installation
------------

DataMapPlot requires a few libraries, but all are widely available and easy to install:

* Numpy
* Matplotlib
* Scikit-learn
* Pandas
* Datashader
* Scikit-image
* Numba

To install DataMapPlot you can use pip:

.. code:: bash

    pip install datamapplot

or use conda with conda-forge:

.. code:: bash

    conda install -c conda-forge datamapplot

-------
License
-------

DataMapPlot is MIT licensed. See the LICENSE file for details.

------------
Contributing
------------

Contributions are more than welcome! If you have ideas for features or projects please get in touch. Everything from
code to notebooks to examples and documentation is *equally valuable*, so please don't feel you can't contribute.
To contribute please `fork the project <https://github.com/TutteInstitute/datamapplot/issues#fork-destination-box>`_, make your
changes and submit a pull request. We will do our best to work through any issues with you and get your code merged in.
10 changes: 5 additions & 5 deletions doc/index.rst
@@ -11,11 +11,11 @@
DataMapPlot: Creating beautiful plots of data maps
==================================================

-DataMapPlot is a small library designed to help you make beautiful data map plots for
-inclusion in presentations, posters and papers. The focus is on producing static plots
-that are great looking with as little work for you as possible. While this involves
-automating most of the aesthetic choices, the library provides a wide variety of ways
-to customize the resulting plot to your needs.
+Creating beautiful plots of data maps. DataMapPlot is a small library designed to help you make beautiful data map
+plots for inclusion in presentations, posters and papers. The focus is on producing static plots that are great
+looking with as little work for you as possible. All you need to do is label clusters of points in the data map and
+DataMapPlot will take care of the rest. While this involves automating most of the aesthetic choices, the library
+provides a wide variety of ways to customize the resulting plot to your needs.

Installation
------------
Binary file added examples/plot_arxiv_ml.png
Binary file added examples/plot_cord19.png
11 changes: 9 additions & 2 deletions examples/plot_cord19.py
@@ -9,12 +9,19 @@
import requests
import PIL
import matplotlib.pyplot as plt
import pandas as pd

plt.rcParams['savefig.bbox'] = 'tight'

cord19_data_map = np.load("CORD19-subset-data-map.npy")
cord19_labels = np.load("CORD19-subset-cluster_labels.npy", allow_pickle=True)

# Prune labels down slightly
label_counts = pd.Series(cord19_labels).value_counts()
small_clusters = label_counts[label_counts <= 700].index
for label in small_clusters:
    cord19_labels[cord19_labels == label] = "Unlabelled"

allenai_logo_response = requests.get(
    "https://allenai.org/newsletters/archive/2023-03-newsletter_files/927c3ca8-6c75-862c-ee5d-81703ef10a8d.png",
    stream=True,
@@ -30,12 +37,12 @@
     highlight_labels=[
         "Effects of the COVID-19 pandemic on mental health",
         "Airborne Transmission of COVID19",
-        "Diagnostic Testing for SARS-CoV2",
+        "COVID19 Diagnosis",
         "Viral Diseases and Emerging Zoonoses",
         "Vaccine Acceptance",
     ],
     label_font_size=6,
-    label_margin_factor=1.75,
+    label_margin_factor=1.5,
     label_direction_bias=1.0,
     highlight_label_keywords={"fontsize": 12, "fontweight": "bold", "bbox": {"boxstyle": "circle", "pad": 0.75}},
     logo=allenai_logo,
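
The "Prune labels down slightly" step in this file folds every cluster with at most 700 points into a single "Unlabelled" class, so only the larger clusters receive text labels. A minimal standalone sketch of the same idea follows. It assumes plain Python lists and uses `collections.Counter` in place of the pandas `value_counts` call from the commit; the function name `prune_small_clusters` and its parameters are hypothetical and not part of DataMapPlot.

```python
from collections import Counter


def prune_small_clusters(labels, max_small_size=700, fallback="Unlabelled"):
    """Return a copy of `labels` with small clusters relabelled.

    Hypothetical helper for illustration only. Any label that occurs
    `max_small_size` times or fewer is replaced by `fallback`, which
    mirrors the `label_counts <= 700` test in the example script.
    """
    counts = Counter(labels)
    small = {label for label, count in counts.items() if count <= max_small_size}
    return [fallback if label in small else label for label in labels]


# Toy data: "A" is a large cluster, "B" and "C" fall at or below the threshold.
labels = ["A"] * 5 + ["B"] * 2 + ["C"]
pruned = prune_small_clusters(labels, max_small_size=2)
```

Returning a new list rather than overwriting in place keeps the original labels available, which may be convenient when trying several thresholds; the script in the commit modifies the NumPy array directly instead.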
Binary file added examples/plot_simple_arxiv.png
Binary file added examples/plot_wikipedia.png
