An HDFS Tutorial for Data Analysts Stuck With Relational Databases

Introduction

By now, you have probably heard of the Hadoop Distributed File System (HDFS), especially if you are a data analyst or someone who is responsible for moving data from one system to another. However, what are the benefits that HDFS has over relational databases?

HDFS is a scalable, open source solution for storing and processing large volumes of data. HDFS has been proven to be reliable and efficient across many modern data centers.

HDFS utilizes commodity hardware along with open source software to reduce the overall cost per byte of storage.

With its built-in replication and resilience to disk failures, HDFS is an ideal system for storing and processing data for analytics. It does not require the underpinnings and overhead to support transaction atomicity, consistency, isolation, and durability (ACID) as is necessary with traditional relational database systems.

Moreover, when compared with enterprise and commercial databases, such as Oracle, utilizing Hadoop as the analytics platform avoids any extra licensing costs.

One of the questions many people ask when first learning about HDFS is: How do I get my existing data into HDFS?

In this article, we will examine how to import data from a PostgreSQL database into HDFS. We will use Apache Sqoop, which is currently the most efficient open source solution for transferring data between HDFS and relational database systems. Apache Sqoop is designed to bulk-load data from a relational database into HDFS (import) and to bulk-write data from HDFS to a relational database (export).
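
As a rough illustration of what such an import can look like, the sketch below assembles a `sqoop import` command line in Python without running it. The flags shown (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import arguments; the helper name `build_sqoop_import`, the host, database, table, and paths are all hypothetical placeholders and not values from the article.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table, target_dir, num_mappers=1):
    """Return the argument list for a Sqoop import of one table into HDFS.

    Hypothetical helper for illustration; it only builds the command.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source database
        "--username", username,
        "--password-file", password_file,   # file holding the database password
        "--table", table,                   # source table to import
        "--target-dir", target_dir,         # HDFS directory that receives the data
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# All values below are made-up placeholders.
cmd = build_sqoop_import(
    jdbc_url="jdbc:postgresql://db.example.com:5432/sales",
    username="analyst",
    password_file="/user/analyst/.pgpass",
    table="orders",
    target_dir="/data/sales/orders",
)

# Print a shell-ready version of the command for inspection.
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it could later be handed to `subprocess.run(cmd)` on a machine where Sqoop and the PostgreSQL JDBC driver are installed.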

Landing Page Design: Building the Ultimate Landing Page

When I started to implement the Ultimate Hacking Keyboard, I wasn’t very marketing savvy. As an engineer, all I could see ahead was product development and technical challenges. However, marketing is just as important and must not be overlooked. A good landing page is a must-have.

Luckily for us, we realized that there was a lot to do before we started our crowdfunding campaign, and that an attractive site could turn this otherwise idle time to our advantage by capturing people’s attention, generating more subscribers, and priming us for the campaign.

How To Improve ASP.NET App Performance In Web Farm With Caching

A Brief Introduction to Caching

Caching is a powerful technique for increasing performance through a simple trick: Instead of doing expensive work (like a complicated calculation or complex database query) every time we need a result, the system can store (or cache) the result of that work and simply supply it the next time it is requested. Because the work does not have to be repeated, the system can respond much faster.
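
The idea can be sketched in a few lines. The example below is a minimal in-memory cache written in Python for illustration only; it is not ASP.NET code, and the names `ResultCache` and `slow_square` are hypothetical. It runs the expensive function once per key and serves later requests from a dictionary.

```python
import time


class ResultCache:
    """Tiny in-memory cache: compute a value once per key, then reuse it."""

    def __init__(self, compute):
        self._compute = compute  # the expensive function to wrap
        self._store = {}         # key -> previously computed result
        self.misses = 0          # how many times the expensive work actually ran

    def get(self, key):
        if key not in self._store:
            # Cache miss: do the expensive work and remember the result.
            self.misses += 1
            self._store[key] = self._compute(key)
        # Cache hit (or freshly stored value): return without recomputing.
        return self._store[key]


def slow_square(n):
    """Stand-in for an expensive calculation or database query."""
    time.sleep(0.01)
    return n * n


cache = ResultCache(slow_square)
first = cache.get(12)   # computed by slow_square
second = cache.get(12)  # served from the cache
```

A single-process dictionary like this is the simplest case. In a web farm, each server would hold its own copy, which is why a shared or distributed cache becomes relevant there.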

Webpack or Browserify & Gulp: Which Is Better?

As web applications grow increasingly complex, making your web app scalable becomes of the utmost importance. Whereas in the past writing ad-hoc JavaScript and jQuery would suffice, nowadays building a web app requires a much greater degree of discipline and formal software development practices, such as:

  • Unit tests to ensure modifications to your code don’t break existing functionality
  • Linting to ensure consistent coding style free of errors
  • Production builds that differ from development builds

The web also provides some of its own unique development challenges. For example, since webpages make a lot of asynchronous requests, your web app’s performance can be significantly degraded by having to request hundreds of JS and CSS files, each with its own small overhead (headers, handshakes, and so on). This particular issue can often be addressed by bundling the files together, so you’re only requesting a single bundled JS file and a single bundled CSS file rather than hundreds of individual ones.

Bundling tools tradeoffs: Webpack vs Browserify

Which bundling tool should you use: Webpack or Browserify + Gulp? Here is a guide to choosing.

Build Ultra-Modern Web Apps with Angular Material

At the Google I/O Conference back in 2014, Google announced Material Design, their new design language. They have since converted many of their popular applications to adhere to this new spec in an effort to provide a consistent experience. Now they are trying to convince you to follow along as well.

Angular Material: Superheroic JavaScript Framework Meets Ultra-Modern Design

What is Material Design?

After a visit to the official Material Design spec, you will immediately get a feeling of ultra-modern minimalism. Basic shapes and flat colors are the theme here. Going through the documentation is quite an experience. I recommend taking a look for yourself, but I will summarize it here.
