Formatting numbers for easy readability in R

dataviz
formatting
numbers

Sometimes you want a raw number. Other times, you want a number that you can instantly verbalise. This post explores an easy solution for getting from “1234567890” to “1,234.6M” without installing any additional packages.

Published

January 30, 2024

Last week’s “aha” moment came courtesy of Garrick Aden-Buie, who pointed me to a great solution for formatting numbers in R without requiring any additional packages.

For a long time, I have used his grkmisc::pretty_num() function, which does what it says on the tin: it makes numbers pretty, or rather, easier to read. A quick demo follows below.

I’ve talked elsewhere about keeping numbers as they are, albeit with a bit of formatting (big marks and monospacing), so that their order of magnitude acts as an extra visual cue when comparing them in a list.

  • 123
  • 1,234
  • 1,234,567,890
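The big-mark formatting in the list above needs nothing beyond base R. Here is a minimal sketch, assuming the values sit in a numeric vector (the variable names are mine, purely for illustration):

```r
# Illustrative values matching the list above
x <- c(123, 1234, 1234567890)

# big.mark inserts the thousands separator; scientific = FALSE guards
# against large values being printed in exponent notation
formatted <- format(x, big.mark = ",", scientific = FALSE)

# format() pads every element to a common width, so the numbers
# right-align when printed in a monospaced font
cat(formatted, sep = "\n")
```

The common width is what makes the order of magnitude visible at a glance once the text is set in a monospaced font.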

But when creating visualisations where the main focus is on summary statistics and comparisons between groups rather than on precision to the nearest 0.01, sometimes printing 1234567890 as “1,234.6M” is the most helpful thing to do for the end user. So grkmisc::pretty_num() was a really helpful function and I used it a lot!

grkmisc::pretty_num(123456)
#> [1] "123.5k"
grkmisc::pretty_num(4567500)
#> [1] "4.6M"
grkmisc::pretty_num(12345678, decimal_digits = 2)
#> [1] "12.35M"

Until I came across two problems. The first was that it didn’t work for negative numbers:

grkmisc::pretty_num(-123456)
#> [1] "-123456"

And the second was that clients were finding it increasingly tricky to install this package as a dependency of the packages I created for them. And for good reason. {grkmisc} was never really intended as a package for wide adoption, and it has since been superseded in many ways by functions inside {epoxy}, which is a much more widely used package. But pretty_num() was nowhere to be seen…

Cue the conversation I had with Garrick last week. I was keen to keep using the function, so I asked if I could pinch the code, change it to work with negative numbers, and add it to my {verbaliseR} package. His response? “Of course, feel free, but actually I’ve since found a better solution: scales::number(11000, scale_cut = scales::cut_long_scale()).” That conversation saved me a lot of unnecessary work. Thank you!

If you are using {ggplot2}, you already have {scales} installed (it’s a {ggplot2} dependency), so this solution also removes the need to install any additional packages. Time to explore what scales::number() can do…

scales::number(12345, 
               scale_cut = scales::cut_long_scale())
#> [1] "12K"

So far so good, but can it do negative numbers?

scales::number(-12345, 
               scale_cut = scales::cut_long_scale())
#> [1] "-12K"

Yes! Ok, but I liked the digit after the decimal point. No problem, we can use the accuracy argument:

scales::number(-12345, 
               accuracy = 0.1, 
               scale_cut = scales::cut_long_scale())
#> [1] "-12.3K"

Great! Now, what if that digit is a 0? Does it keep it?

scales::number(-12002, 
               accuracy = 0.1, 
               scale_cut = scales::cut_long_scale())
#> [1] "-12.0K"

By default, yes. But if we want to, we can also pass drop0trailing = TRUE to base::format() via the ... argument of scales::number():

scales::number(-12002, 
               accuracy = 0.1, 
               drop0trailing = TRUE, 
               scale_cut = scales::cut_long_scale())
#> [1] "-12K"
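To see what is being passed along, here is the same option used on base::format() directly. This is only a sketch of the underlying behaviour on an already-scaled value; it does not reproduce the scaling or the suffix that scales::number() adds, and the variable names are just for illustration.

```r
# nsmall = 1 forces one decimal place, so the trailing zero is kept
with_zero <- format(12.0, nsmall = 1)
with_zero
#> [1] "12.0"

# drop0trailing = TRUE strips trailing zeros and the dangling decimal point
without_zero <- format(12.0, nsmall = 1, drop0trailing = TRUE)
without_zero
#> [1] "12"

# Non-zero decimals are left alone
kept <- format(12.3, nsmall = 1, drop0trailing = TRUE)
kept
#> [1] "12.3"
```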

Ok, one final thing: what happens when we get past 999M? I hadn’t thought about this until it came up in a parameterised plot I was creating for a client later in the week.

scales::number(1234567890, 
               accuracy = 0.1, 
               drop0trailing = TRUE, 
               scale_cut = scales::cut_long_scale())
#> [1] "1 234.6M"

By default, nothing too surprising, but I missed the formatting as “1,234.6M”. Thankfully, scales::number() also has a big.mark argument, so that was an easy fix!

scales::number(1234567890, 
               accuracy = 0.1, 
               drop0trailing = TRUE, 
               big.mark = ",",
               scale_cut = scales::cut_long_scale())
#> [1] "1,234.6M"
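The reason this value stays in millions is the long scale itself: there, a billion is 10^12 rather than 10^9, so nothing switches to “B” until then. If your readers expect the short scale, {scales} also provides cut_short_scale(). Here is a sketch for comparison, using the same value as above (the variable names are mine):

```r
x <- 1234567890

# Long scale: "B" starts at 1e12, so this value is still shown in millions
long <- scales::number(x,
                       accuracy = 0.1,
                       big.mark = ",",
                       scale_cut = scales::cut_long_scale())

# Short scale: "B" starts at 1e9
short <- scales::number(x,
                        accuracy = 0.1,
                        big.mark = ",",
                        scale_cut = scales::cut_short_scale())

c(long = long, short = short)
#>       long      short 
#> "1,234.6M"     "1.2B"
```

Which one is right depends on the audience, so it is worth checking what your readers mean by “billion” before picking a scale.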

And there we have it, a “zero extra dependencies” solution for formatting numbers for easy readability by humans! It even works on vectors, and we can add prefixes (which end up in the right place with respect to the negative sign!) and suffixes if we want to:


scales::number(c(-12002, 12, 1234, 123456789, 1234567890),
               accuracy = 0.1, 
               drop0trailing = TRUE, 
               big.mark = ",",
               scale_cut = scales::cut_long_scale(),
               prefix = "$",
               suffix = " per penguin")
#> [1] "-$12K per penguin"     "$12 per penguin"       "$1.2K per penguin"    
#> [4] "$123.5M per penguin"   "$1,234.6M per penguin"
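If you find yourself repeating these arguments, one option is to wrap them in a small helper. The sketch below is my own illustration rather than anything that ships with {scales}; the name readable_number() and its defaults are assumptions.

```r
# Hypothetical helper (not part of {scales}): fixes the arguments used
# throughout this post and forwards anything else to scales::number()
readable_number <- function(x, accuracy = 0.1, ...) {
  scales::number(x,
                 accuracy = accuracy,
                 drop0trailing = TRUE,
                 big.mark = ",",
                 scale_cut = scales::cut_long_scale(),
                 ...)
}

readable_number(c(-12002, 1234567890))
#> [1] "-12K"     "1,234.6M"

readable_number(1234, prefix = "$")
#> [1] "$1.2K"
```

Because it takes a numeric vector and returns a character vector, a function like this can also be handed to the labels argument of a {ggplot2} scale, for example scale_y_continuous(labels = readable_number).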

Happy days! 💰🐧

Citation

For attribution, please cite this work as:
“Formatting Numbers for Easy Readability in R.” 2024. January 30, 2024. https://www.cararthompson.com/posts/2024-01-30-formatting-numbers-in-R/2024-01-30_formatting-numbers-in-R.html.