My data-driven journey to finding our dream home

As many of you know, property is one of my many hobbies. In fact, I even wrote a guide on buying property, which you can check out here. Fortunately, my wife shares this passion with me. About a year ago, we made the decision that Auckland is our forever city. With most of our family and friends here, the lifestyle we want, and the high quality of life it offers (relatively speaking), it felt like the ideal place to settle down and raise a family in the "kiwi" style we grew up with (maybe minus walking barefoot these days), even if it came at the sacrifice of the career opportunities and vibrancy that larger cities overseas offer.

With that decision made, we embarked on a mission to find our dream home—or land to build on—within Auckland, no matter how long it would take, with a cheque book to suit. Between the two of us, we had about 20 different requirements and criteria. However, going to open homes repeatedly had me mumbling about scalability and whatnot...

To tackle this, I turned to data. Even if we could eliminate half of our requirements easily, it would speed up our house search. About half of our requirements focused on the house itself, but the other half were about the area and immediate surroundings—things like proximity to neighboring houses, which I figured I could infer using data. With a wealth of open data available from Auckland Council, LINZ, and other sources, I spent a few months building a PostgreSQL database (thanks, Supabase!) of every single property across Auckland, complete with their key features (contours, land features, natural hazards, distance to schools, proximity to Bunnings... you get the point). From there, I developed a mobile app that allowed us to quickly assess homes listed for sale.
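To give a flavour of the kind of inference involved—this is a simplified sketch with made-up coordinates, not the actual pipeline—a criterion like "distance to the nearest school" can be approximated straight from WGS84 coordinates with the haversine formula:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 (lat, lon) points."""
    r = 6371000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical coordinates: one property and two nearby schools
property_pt = (-36.8485, 174.7633)  # central Auckland, for illustration only
schools = {
    "School A": (-36.8523, 174.7691),
    "School B": (-36.8601, 174.7480),
}
nearest = min(schools, key=lambda s: haversine_m(*property_pt, *schools[s]))
```

In practice a PostGIS query over properly projected geometries (e.g. `ST_Distance`) handles this far more robustly at scale, but the underlying idea is the same.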


It wasn't for the faint-hearted. Processing nearly a hundred different data sources, many containing a million-plus records each, meant procurement, quality checks, standardization (I painfully learnt what a CRS was, and how it differed across many geospatial datasets), and ingestion. I had to go back to my university textbooks and revise Big-O time and space complexity to avoid frying my meagre computer. All part of the fun, however!
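To illustrate the kind of complexity fix that matters at this scale—a hypothetical sketch with an arbitrary cell size, not the code I actually ran—comparing every property against every other point is O(n²), but bucketing points into a coarse grid means each nearest-neighbour lookup only scans a handful of cells:

```python
from collections import defaultdict

CELL = 0.01  # grid cell size in degrees (~1 km); an illustrative tuning choice

def build_grid(points):
    """Bucket (lat, lon) points by cell so lookups scan a few cells, not all n points."""
    grid = defaultdict(list)
    for p in points:
        grid[(int(p[0] // CELL), int(p[1] // CELL))].append(p)
    return grid

def candidates_near(grid, lat, lon):
    """Return points in the 3x3 block of cells around (lat, lon)."""
    ci, cj = int(lat // CELL), int(lon // CELL)
    out = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out.extend(grid.get((ci + di, cj + dj), []))
    return out

# Hypothetical points: two neighbouring properties and one far away
pts = [(-36.848, 174.763), (-36.849, 174.764), (-37.5, 175.0)]
grid = build_grid(pts)
near = candidates_near(grid, -36.8485, 174.7635)  # only the first two survive
```

A proper spatial index (PostGIS's GiST, for instance) does this better, but the grid trick is often enough to turn an overnight job into a coffee-break one.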

The hard work paid off: it made house hunting much easier, but I knew I could do even more to streamline the process. Enter a full web app that lets us search and filter properties based on detailed characteristics. While commercial options like CoreLogic and Re-Leased exist, they’re not freely accessible and don’t offer the precise filters we wanted—such as proximity to neighbors or avoiding corner properties. Plus, the ability to add likes, dislikes, and notes allows my wife and me to collaborate seamlessly, whether we’re evaluating homes together or apart.
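Once the features are precomputed per property, the filters themselves reduce to simple predicates. A minimal sketch, with hypothetical field names rather than the real schema:

```python
# Hypothetical property records; field names and values are illustrative only.
properties = [
    {"id": 1, "nearest_neighbour_m": 25.0, "is_corner": False, "land_m2": 820},
    {"id": 2, "nearest_neighbour_m": 4.5,  "is_corner": True,  "land_m2": 450},
    {"id": 3, "nearest_neighbour_m": 12.0, "is_corner": False, "land_m2": 600},
]

def matches(p, min_neighbour_m=10, allow_corner=False, min_land_m2=500):
    """The kind of filter the web app exposes: neighbour distance, corner, land size."""
    return (p["nearest_neighbour_m"] >= min_neighbour_m
            and (allow_corner or not p["is_corner"])
            and p["land_m2"] >= min_land_m2)

shortlist = [p["id"] for p in properties if matches(p)]  # ids 1 and 3 pass
```

The same predicates translate directly into a SQL `WHERE` clause over the precomputed columns, which is what makes the heavy upfront ingestion worthwhile.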

This approach has completely flipped our house-hunting process. Instead of reactively attending open homes, we’re now proactively identifying properties that fit our criteria.



Coming up next: building a machine learning model using Google's Street View APIs to identify whether a property "looks" good. Right now I'm starting with the basics of identifying the cladding (a simple Stucco vs Not Stucco classifier), and eventually I'll extend it to see whether the front profile of the house matches our tastes. Stay tuned!



The bad news? There are probably fewer than 300 properties out of the full million we'd seriously consider, and that's before a site visit to look at the inside! I suppose there's another lesson to be learnt there...