Your Blog

The story of the sneaky zeros

For my master’s project, I used a dataset that provided a large collection of LLM output text, alongside a numerical label indicating how significant a hallucination the text contained (the consistency rating) for some of the ... read more →

Dirty crime data: a case study of the Chicago crime dataset

Even Crime Data Can’t Stay Clean. We pulled the Chicago Crime dataset straight from the city’s open data portal using a simple curl command. It’s public, it’s official, and, like most real-world data, it’s messier than it looks. curl -L -o Chicago_Crim ... read more →

Bad naming is bad

Give a dog a bad name. This is a test post. We at Fancy Company treat our data very badly. For example, we never give names to variables: x y z 1 2 3. But what are x, y, and z? We forget, so we don't know and don't remember. read more →
