• OS X: ctrl + l (alternative: clear) does not really clear the screen; it just moves the log up out of the current screen, so you can still scroll back to it. Pressing command + k really clears the console.
  • cd -; go back to the directory you were previously in.
  • echo; prints its arguments, a simpler kind of printf
  • (cd /tmp && ls)
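    Jump to a directory, execute a command, and jump back to the current dir (the parentheses run the commands in a subshell, so the current shell never actually leaves its directory)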
  • tail – output the last part of files (default is 10 lines)
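  • ps aux | grep firefox; find a running process by name
  • ps aux | sort -n -k 4 | tail; display the top ten running processes, sorted by memory usage (column 4 of ps aux is %MEM; -n sorts numerically)
  • ps aux | sort -n -k 3 | tail; display the top ten running processes, sorted by CPU usage (column 3 of ps aux is %CPU)
  • sort -k position; sort by the field at the given position (fields are numbered starting from 1)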
  • echo "!!" >; followed by a file name, creates a script from the last executed command
  • history; list previously executed commands
  • !5; run the 5th command in the history list
  • rm !(*.foo|*.bar|*.baz); delete all files except those matching the listed patterns (in bash this needs shopt -s extglob)
  • pgrep firefox; get the PID of a process by name
  • ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'; prints a tree-like listing, similar to the "tree" command, which Mac OS does not ship by default
  • mkdir -p a/long/directory/path
    Make a directory, including intermediate directories. You can also use the list notation:
    mkdir -p root/child0/child1/{ab,cd,de,ef}
    will create 4 directories named 'ab', 'cd', 'de', 'ef' in root/child0/child1.

  • cat ~/.ssh/ | ssh user@machine "mkdir ~/.ssh; cat >> ~/.ssh/authorized_keys"; copy your SSH public key to a remote machine for passwordless login, from a machine that doesn't have ssh-copy-id


vim setup

auto complete

" Autocomplete already-existing words in the file with tab (extremely useful!)
" Define the function: insert a real tab at the start of a line or after a
" non-word character, otherwise trigger keyword completion
function! InsertTabWrapper()
  let col = col('.') - 1
  if !col || getline('.')[col - 1] !~ '\k'
    return "\<tab>"
  else
    return "\<c-p>"
  endif
endfunction
" Map <tab> in insert mode to call the function
inoremap <tab> <c-r>=InsertTabWrapper()<cr>



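  • auto time = 11_ms; (auto deduces the variable's type from its initializer; 11_ms here is a user-defined literal)
  • for (string s : vec) {} (range-based for loops)
  • hash tables introduced: std::unordered_set and std::unordered_map (std::set and std::map are not hash tables; they are typically implemented as balanced binary search trees)
  • [](){} (lambda): C++11 provides the ability to create anonymous functions, called lambda functions. These are defined as follows: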
    [](int x, int y) { return x + y; }
  • rvalue references (C++11 adds a new non-const reference type called an rvalue reference, identified by T&&. This refers to temporaries that are permitted to be modified after they are initialized, for the purpose of allowing "move semantics".)
  • unique_ptr: a smart pointer like shared_ptr, but with exactly one owner; it cannot be copied, only moved. The object is deleted automatically when its owning unique_ptr is destroyed. std::auto_ptr is deprecated.
  • variadic templates: prior to C++11, templates (classes and functions) could only take a fixed number of arguments, which had to be specified when the template was first declared. C++11 allows template definitions to take an arbitrary number of arguments of any type.
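
A minimal sketch pulling several of the features listed above into one place, assuming a C++11 compiler. The function names (count_words, add_with_lambda, pass_ownership, sum_all) are made up for illustration and are not from any library.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// auto, range-based for, and a hash table: count how often each word occurs.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;  // hash table, unlike std::map
    for (const auto& w : words) {                 // range-based for loop; auto deduces std::string
        ++counts[w];                              // missing keys start at 0
    }
    return counts;
}

// Lambda: an anonymous function stored in a variable and called like a function.
int add_with_lambda(int a, int b) {
    auto add = [](int x, int y) { return x + y; };
    return add(a, b);
}

// unique_ptr and move semantics: ownership is transferred, never copied.
std::unique_ptr<std::string> pass_ownership(std::unique_ptr<std::string> p) {
    return p;  // returning a by-value parameter moves it out
}

// Variadic templates: sum any number of arguments.
// Base case: a single argument is its own sum.
template <typename T>
T sum_all(T value) {
    return value;
}

// Recursive case: peel off the first argument and sum the rest.
template <typename T, typename... Rest>
T sum_all(T first, Rest... rest) {
    return first + sum_all(rest...);
}
```

Because unique_ptr cannot be copied, a caller has to hand it over explicitly with std::move, e.g. pass_ownership(std::move(p)); afterwards p is null and the returned pointer is the sole owner. sum_all(1, 2, 3) expands at compile time into a chain of additions.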


HDFS = Hadoop Distributed File System

Hadoop = HDFS + MapReduce

Hive = provides a SQL layer over HDFS or other file systems


Apache Giraph is an Apache project to perform graph processing on big data. Giraph utilizes Apache Hadoop's MapReduce implementation to process graphs. Facebook used Giraph with some performance improvements to analyze one trillion edges using 200 machines in 4 minutes.