Posts | Emily Riederer

Industry information management for causal inference

Proactive collection of data to comply with or confront assumptions

May 30, 2023 causal, data

Crosspost: The Art of Abstraction in ETL

Rounding out my three-part ETL series from Airbyte’s developer blog

May 3, 2023 data, workflow

The Art of Abstraction in ETL: Dodging Data Extraction Errors

Cross-post from guest post on Airbyte’s developer blog

Mar 22, 2023 data, workflow

Goin' to Carolina in my mind (or on my hard drive)

Out-of-memory processing of North Carolina’s voter file with DuckDB and Apache Arrow

Sep 25, 2022 data, sql

Oh, I'm sure it's probably nothing

How we do (or don’t) think about null values and why the polyglot push makes it all the more important

Sep 5, 2022 rstats, python, sql, data

Update: grouped data quality check PR merged to dbt-utils

After a prior post on the merits of grouped data quality checks, I demo my newly merged implementation for dbt

Aug 26, 2022 data, changelog, dbt

Using databases with Shiny

Key issues when adding persistent storage to a Shiny application, featuring {golem} app development and Digital Ocean serving

Jan 2, 2022 rstats, shiny, data

How to Make R Markdown Snow

Much like ice sculpting, applying power tools to absolutely frivolous pursuits

Dec 11, 2021 rstats, rmarkdown

Make grouping a first-class citizen in data quality checks

Which of these numbers doesn’t belong? -1, 0, 1, NA. You can’t judge data quality without data context, so our tools should enable as much context as possible.

Nov 27, 2021 data

Why machine learning hates vegetables

A personal encounter with ‘intelligent’ data products gone wrong

Nov 10, 2021 data-disasters