• Using Amazon EC2 Container Service

    Amazon ECS, or EC2 Container Service, is a container management service for Docker containers. Similar to Kubernetes in intent, the service allows users to provision Docker containers in a fully managed cluster of EC2 instances. This post is a quick summary of how to get up and running with your own ECS cluster.

    The motivation behind containers is to optimize the usage of underlying resources like CPU and memory. Containerized infrastructure provides a dense compute environment, allowing us to pack more workloads onto the same hardware without having to spend $$ on idle or underutilized resources.

    Read on →

  • Apache Nutch - Step by Step

    Search is one of the most fantastic areas of the technology industry, and has been addressed many, many times with different algorithms, producing varying degrees of success. We get so used to it that I often wish I had a Cmd-F while reading a real book.

    Recently we had our quarterly Hack Week at Marqeta, and one of the ideas was to build search around our public pages. These pages would include the public website assets, as well as the API developer guides and documentation. This post is a quick summary of the infrastructure, setup, and gotchas of using Nutch 2.3.1 to build a site search - essentially notes from this hack week project.


    Read on →

  • Caching - Gotchas & Lessons Learned

    It has been said time and again - “There are only two hard things in Computer Science: cache invalidation and naming things”. Having run into both of these problems in my professional career, I figured I could write a post summarizing the lessons I have learned along the way from seeing and building various caching architectures across many companies, big and small.

    Just like threading, caching is easy to code, but often creates more problems than it solves. These problems can arise from - you guessed it - invalidation, sub-par efficiency, inconsistency, and many more. It is also one of my favorite topics for technical interviews :)

    Read on →

  • Boto 3 and SQS

    Boto 3 is the AWS SDK for Python. In fact, this SDK is the reason I picked up Python - so I can do stuff with AWS with a few lines of Python in a script instead of a full-blown Java setup. It's fun, easy, and pretty much feels like working on a CLI with a rich programming language to back it up. In this post we will use SQS and Boto 3 to perform basic operations on the service.

    Read on →

  • Python Primer for Java Developers

    I took these notes while ramping up on Python, to be able to contribute to a few GitHub projects, deploy AWS Lambdas quicker, and use boto3 for quite a few projects. Plus, I find it a very light, frictionless, quick and easy scripting language to have in the arsenal. This information can be particularly useful to anyone who is coming from a Java background and wants to compare and contrast the two languages’ basic constructs and syntax.

    Read on →