Categories
hadoop
- Big data anti-patterns presentation
- Using the libjars option with Hadoop
- Configuring and tuning MapReduce's shuffle
- Bucketing, multiplexing and combining in Hadoop - part 1
- Slurper v2
- Next Generation Hadoop - It's Not Just Batch!
- Sorting text files with MapReduce
- Hadoop unit testing with MiniMRCluster and MiniDFSCluster
- Controlling user logging in Hadoop
- Simplifying secondary sorting in MapReduce with htuple
- Hadoop in Practice, Second Edition
- Bucketing, multiplexing and combining in Hadoop - part 2
- Secondary sorting with Avro
- Configuring memory for MapReduce running on YARN
- Using Hadoop 2.2 as a sink in Flume 1.4
- Using Oozie 4.4.0 with Hadoop 2.2
- Understanding how Parquet integrates with Avro, Thrift and Protocol Buffers
- Using awk and friends with Hadoop
*nix
- LZOP decompression - revenge of the useless cat
- Executing variables that contain shell operators
- OSX, Chrome and DNS
- Lexicographically sorting large files in Linux
- Pipes and useless cats
- How partitioning, collecting and spilling work in MapReduce
- Using sed to perform inline replacements of regex groups
- Bare-metal installation for Nginx and Jekyll