Eric Ma

pyjanitor: Clean APIs for Cleaning Data

Data cleaning with the `pandas` API can sometimes be confusing. I will describe the team effort behind `pyjanitor`, which aims to create a cleaner API for data cleaning with a focus on readability and expressiveness. `pyjanitor` is a port of the R `janitor` package and provides a verb-based, method-chaining API to pandas users. I will explain `pyjanitor`'s history and design, and walk through a suite of examples, coded in Jupyter notebooks, that show how to use its data cleaning functions. I will also highlight how easy it is to contribute routine data cleaning functions to the library, and issue a call to action for newcomers to contribute at the sprints!
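To give a flavor of what a verb-based, method-chaining style looks like, here is a minimal sketch in plain `pandas`. The helper functions `clean_names` and `remove_empty_rows`, the column names, and the data are all hypothetical stand-ins written for this illustration; they are not `pyjanitor`'s own implementation, and the library's actual functions may differ in name and behavior.

```python
import pandas as pd


def clean_names(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: lowercase column names and replace spaces with underscores."""
    return df.rename(columns=lambda name: name.strip().lower().replace(" ", "_"))


def remove_empty_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop rows in which every value is missing."""
    return df.dropna(how="all")


# Made-up example data with untidy column names and one fully empty row.
raw = pd.DataFrame(
    {
        "Sample ID": [1, 2, None],
        "Measured Value": [0.5, 1.5, None],
    }
)

# Each step is a verb, and the chain reads top to bottom as a cleaning recipe.
cleaned = (
    raw
    .pipe(clean_names)
    .pipe(remove_empty_rows)
    .rename(columns={"measured_value": "value"})
)

print(cleaned)
```

Writing each cleaning step as a named function and chaining the calls keeps the order of operations visible in one expression, which is the readability goal the talk describes.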
