Here you'll find mostly packages for the R programming language that I've been working on. Some of them were created for the European Data Journalism Network (EDJNet); many others were developed for a variety of more or less serious reasons.
They include:
- tidywikidatar - Interact with Wikidata and get tidy data frames in response (it's also on CRAN)
- ganttrify - Create beautiful Gantt charts with ggplot2 (far from perfect, but apparently very popular, with hundreds of GitHub users kind enough to give it a star)
- castarter.legacy - The legacy version of castarter, a content analysis starter toolkit for R (it's the first "big" package I made for R... it shows my inexperience when I wrote most of the codebase back in 2015, but it still works nicely and colleagues and I use it quite regularly; it has plenty of functionality, and even if it could use a more comprehensive vignette, it should still be usable by novice users)
- castarter - A more modern, fully featured, and consistent iteration of castarter, the Content Analysis Starter Toolkit for the R programming language. It facilitates text mining and web scraping by taking care of many of the most common file management issues, keeps track of download progress in a local database, facilitates extraction through dedicated convenience functions, and allows for basic exploration of textual corpora through a Shiny interface. It is currently under active development and not yet fully functional.
- ytdlpr - R wrapper for yt-dlp, focused on extracting and processing subtitles of videos posted on YouTube; it makes it possible, for example, to extract all segments of the videos posted by a user that include a given keyword.
- latlon2map - Facilitates matching lat/lon data with administrative units and other geographic shapes (it also includes a lot of convenience functions for downloading and caching geo-spatial datasets... not a beauty, but it gets the job done and I use it in so many of my everyday projects)
- rbackupr - An R package to back up to Google Drive with limited permissions, useful e.g. for uploads from remote servers; speedy, thanks to local caching of metadata (not fully documented, but reasonably functional)
- nomnomlgraph - Create nomnoml diagrams in R based on data frames with edges and nodes
- riskviewer - Show risks and probability in real-world contexts (conceptually, this may be one of the most valuable things I've worked on. Check out the theoretical background and give a quick spin to the Shiny app showcasing some basic functionality. Unfortunately, the package does not yet work consistently, but I hope to make it better)
- networkedwebsitesdetector - A structured approach for finding networked websites (I don't even know what to say... this is kind of great, but also I never had the time to really polish and finalise it, so...)
- genderedstreetnames - Automatically find the gender of street names, then manually fix what the automatic part got wrong.
- streetnamer - Match street names to the people or objects they are dedicated to (not fully polished, but this is an advanced project, with a functioning Shiny interface)
- shinyshoppinglist - A shopping list app made in R Shiny (this is really basic, but also, it actually works)
- cornucopia - Facilitate reporting on sponsored and organic activities on Facebook, Instagram, and LinkedIn (will possibly include other platforms)
- zoteror - Access the Zotero API in R (it does what it says on the tin, reasonably functional with a clear README... I may put it on CRAN one day)
- plausibler - Access the Plausible Analytics API from R
- huecontroller - Control Philips Hue lights using the R programming language (so... one night I was on my couch and wanted to soften the lights, but didn't want to get up, so I ended up writing an R package to control lights... it even has a reasonably functional Shiny app, but it only handles intensity and warmth for the time being; setting colour is possible, but not yet integrated in the Shiny app)
- Prigozhin audio files, transcribed - An automatic transcription of all the audio messages posted on Prigozhin’s official Telegram channel
- Textual datasets (mostly) from Russia - available in full, or as metadata only. A more formal release will follow.
- tifkremlinen - A corpus with all items published on the website of the Kremlin (1999-2020) (it even has a hex logo and a DOI, so it's surely serious stuff; to be honest, I do have big plans about expanding this one)
- olympics2020nuts - Retrieve details about Olympics 2020 medalists via Wikipedia and Wikidata (check out the readme for more details as well as this nice map)
- european_routes - Matching data from Eurostat's datasets on flights to hubs with coordinates and data on train routes; see also the step-by-step process
- lau_centres - Population-weighted centres of local administrative units and consistent concordance with NUTS regions (website with all details)
- When Europeans go to the cinema, what do they watch? - An interactive exploration of cinema-goers' habits in Europe based on twenty years of data (1996-2016) on 40 996 films (just visuals, no dataset).
- gpx2pdf - R package and Shiny interface to create a PDF printout based on a GPX track (you know, with elevation charts and black and white maps? I made it for a very specific project and started to turn it into an R package, but it still shows that it's only halfway there... perhaps still useful if somebody aims to achieve something similar)
I am happy to have contributed to some of the most used R packages. These are small contributions, but I remain nonetheless proud to be featured in the "acknowledgments" section of the release notes of tidyr (version 1.0), readr (version 2.0), and dbplyr (version 2.3.0). I have also contributed to other packages, such as workflowr, rtweet, labeleR, and wikidataR, and reported confirmed bugs in others, including arrow and fs.
- Animating ‘One Degree Warmer’ time series with ggplot2 and gganimate - 9 November 2018
- European Elections 2019 and Italy's varying size - 11 June 2019
- How to feel lucky on a Monday morning: calculating the travel distance between places and each point of the European population grid - 27 November 2019
- How to find the population-weighted centre of local administrative units - 27 March 2020
- Beautiful Gantt charts with ggplot2 - 4 June 2020
- Google Earth Studio as a data visualisation tool (with R) - 8 October 2020
- A new R package for exploring the wealth of information stored by Wikidata: tidywikidatar - 23 April 2021
- Visualising risk: a modern implementation of the Risk Characterisation Theatre - 29 April 2021
- Finding gendered street names. A step-by-step walkthrough with R - 16 July 2021
- The data you need to win the Olympics if you go NUTS - 3 August 2021
- I maintain a reasonably updated Docker image of Omeka S
- I am one of the main translators into Italian of Omeka S and Tropy
- Text as data & data in the text - Studying conflicts in post-Soviet spaces through structured analysis of textual contents available on-line - tadadit.xyz
More about me on my website, giorgiocomai.eu