Hi! I’m Matthew. I've spent a decade working with big data.

Partner with me to unlock the power of your data and make the most of your big data infrastructure.

Get a Quote

My Services

Data Infrastructure Architecture

To solve data-intensive problems, you need serious data infrastructure. From standing up distributed data storage to tuning parallel runtime engines, you can count on me to deliver.

Data Engineering

I build the whole data engineering stack: sourcing datasets, developing ETL pipelines, and delivering reliable production outputs.

Data Analytics

If you have a treasure trove of valuable data, I can help you access it. From data warehouse (and data lake) development to analytics application set-up and training, I can do it all.

GDPR and CCPA Implementation

If you store any user data, you need a data governance system. By combining my legal background and technical know-how, I build systems that clean user data, provide audits, generate automated documentation, and are capable of self-validation.

Cloud Migrations

Moving distributed systems to (or from) a cloud vendor is intensive work. Let me plan and execute the project for you, so you can get back to work.

Training and Onboarding

If you have an engineering team that is new to big data, I can get them up to speed fast. My blog is a great example of my ability to teach these concepts.

Why You Should Hire Me

I've spent a decade working with big data. I can help you get the results you need.

Learning to ingest, store, process, and operationalize big data can take years. I have firsthand experience delivering quality applications across a range of businesses.

I can help you answer questions like

  • What analytics software should my team use?
  • How should we redact data for the GDPR and CCPA?
  • What is the most cost-effective cloud offering?
  • How can we identify useful data?
  • What data warehouse makes the most sense for our business?
  • What format should my data be stored in?
  • How should we secure our big data?
  • Should we be using Spark, or something different?
  • How do we effectively process our data?
  • How do we debug problems in our ETL pipelines?
  • How do we get our internal teams up to speed?

Contact Me for a Quote

About Me

I spend my workdays helping businesses get to grips with their ‘big data’. That usually means architecting, building, deploying, and monitoring data pipelines using Spark, Kafka, Hadoop, and more.

I’m a father and husband, and would not be capable of doing anything useful without my super smart and supportive wife, Alexandra.

Prior to Rathbone Labs, I founded and ran a data analytics start-up (Beekeeper Data), ran a team as VP of Engineering at Kickbox, designed and built distributed software at Shoutlet (acquired by Spredfast), built large-scale analytics infrastructure at Foursquare, and built real-time analytics software at Drop.io (acquired by Facebook).

I make technical tutorials on my blog, founded Big Data Madison, and have an MS in Computer Science from the Courant Institute of Mathematical Sciences at NYU.