
An interdisciplinary journey in data visualisation

Keynote talk at the Women in Data Science 2025 conference, held at Heriot-Watt University. An exploration of how I ended up doing what I do, peppered with tips and resources for future dataviz projects.

Published

May 28, 2025

Resources

I mentioned a number of resources during my talk. Here they are in the order in which I mentioned them:

Dataviz code

And here is the full code for the visualisation I shared to tell the story of “10% of one in seven” (source: Esi Hardy, Celebrating Disability):

library(ggplot2)

# This keeps the coordinates of the points consistent across iterations...
set.seed(202502)

cd_data <- dplyr::tibble(x_coord = sample(seq(1, 5, length.out = 700), 700),
                         y_coord = sample(seq(1, 4, length.out = 700), 700),
                         identify_d = c(rep("no", 600), rep("yes", 100)))

# ... which allows us to consistently pick out the x_coordinates of 10%
# of the purple dots which fall along a pleasing path across the visualisation
one_in_10 <- c(1.915594, 2.316166, 2.567954, 2.648069,
               2.997139, 3.391989, 3.712446, 4.216023,
               4.851216, 1.291845)

# Creating a background colour from Esi's brand colours
background <- monochromeR::generate_palette("#F4DFD2", "go_lighter", n_colours = 3)[3]

ggplot(cd_data) +
  geom_point(aes(x = x_coord,
                 y = y_coord,
                 fill = identify_d,
                 alpha = dplyr::case_when(round(x_coord, 6) %in% one_in_10 ~ 0.9,
                                          identify_d == "yes" ~ 0.25,
                                          TRUE ~ 0.1),
                 size = sample(1:3, 700, replace = TRUE),
                 colour = dplyr::case_when(round(x_coord, 6) %in% one_in_10 ~ "#FFB700",
                                    TRUE ~ "white"),
                 stroke = dplyr::case_when(round(x_coord, 6) %in% one_in_10 ~ 1.6,
                                    TRUE ~ 0)),
             shape = 21,
             show.legend = FALSE) +
  scale_size(range = c(4.5, 6)) +
  scale_colour_identity() +
  scale_alpha_identity() +
  scale_fill_manual(values = c("no" = "#c86020", "yes" = "#601c8d")) +
  theme_void() +
  theme(plot.background = element_rect(fill = background, colour = background))
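
As a quick sanity check on the numbers behind the plot, here is a minimal sketch that rebuilds the same data and counts the groups. The `flagged` data frame, the `highlighted` column and `share_yes` are names introduced here for illustration; they are not part of the code above. Whether all ten highlighted dots land in the "yes" group depends on the seed reproducing the same sample on your version of R, so the resulting table is worth eyeballing rather than assuming.

```r
# Same seed and data construction as in the visualisation code
set.seed(202502)

cd_data <- dplyr::tibble(x_coord = sample(seq(1, 5, length.out = 700), 700),
                         y_coord = sample(seq(1, 4, length.out = 700), 700),
                         identify_d = c(rep("no", 600), rep("yes", 100)))

one_in_10 <- c(1.915594, 2.316166, 2.567954, 2.648069,
               2.997139, 3.391989, 3.712446, 4.216023,
               4.851216, 1.291845)

# Hypothetical helper column (an assumption, not in the original code):
# TRUE for the dots whose rounded x coordinate is in one_in_10
flagged <- dplyr::mutate(cd_data,
                         highlighted = round(x_coord, 6) %in% one_in_10)

# How many dots fall in each group, split by whether they are highlighted
print(dplyr::count(flagged, identify_d, highlighted))

# "One in seven": the share of dots in the "yes" group (100 of 700)
share_yes <- mean(cd_data$identify_d == "yes")
print(share_yes)
```

If the highlighted rows do not all show up under `identify_d == "yes"`, the coordinates in `one_in_10` need re-picking for your setup.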

Reuse

Citation

For attribution, please cite this work as:
“An Interdisciplinary Journey in Data Visualisation.” 2025. May 28, 2025. https://www.cararthompson.com/talks/women-in-datascience-journey/.