Monday, March 13, 2017

Docker: delete all tags of image

Handy one liner to delete all tags of a particular docker image (on Linux):

docker images | grep rabbitmq | tr -s ' ' | cut -d ' ' -f 2 | xargs -I {} docker rmi{}  

Here are the image tags that this one-liner removes:

thusharaw@denali ois-rabbitmq (thushara/rabbitv1)*$ docker images | grep rabbitmq                   2017031300          6fb1ed865ae6        3 hours ago         179 MB                   2017031307          6fb1ed865ae6        3 hours ago         179 MB                   2017031007          21b84ae0586e        2 days ago          179 MB                   201703100           43d51d243631        2 days ago          179 MB                   201703107           08927efd735e        2 days ago          179 MB                   201703109           646cb7852d8e        2 days ago          179 MB

Let's break down the on-liner:

1) docker images => 

get the images

2) grep rabbitmq

find the tags you care about

3) tr -s ' '

convert all repeating contiguous spaces to a single space so we can easily index to a specific column, the tag in this case

4) xargs -I {} docker rmi{} 

pass the thusly recovered tag to the `docker rmi` command, but use xargs to change the stdout of the previous cmd to a cmd line arg (which is what `docker rmi` works with)

Saturday, August 13, 2016

Java : Don't compare Integer objects with ==

In Java, == operator, if used between objects, compares their references. However, if you compare an object and a primitive type, Java "unboxes" the object to the primitive type and compares the primitives. This auto-conversion is a very pleasant feature of the language, but unfortunately, there is no unboxing if both types being compared to are objects.

So comparing two Integer objects will not work.

Comparing the intValue() on the objects is the way to go.

Sometimes, we may forget that two integer objects are being compared. Think of using a count as the value of a hash map. This is an integer, but due to the way generics are implemented, the value needs to be an Integer object. Comparing counts will need to use the intValue of the Integer object, such as needed for sorting the map by count.

The other relational operators (>, <, >=, <=) work as one might more intuitively expect. They use the obj) method of the Integer class and work on the integer value held in the Integer instances being compared.

It is not clear why the Java language designers chose to wire == operator very differently from the other relational operators.

So the lesson is that whenever you work with Integer instances, all relational operators except the equality operator (==) works in the intuitive sense of comparing the integers. For equality, you could use one of the following:

boolean Integer.equals(Object obj)  

int Integer.compareTo(Integer another)

integerObj1.intValue() == integerObj2.intValue()

Wednesday, July 27, 2016

Optimal Solutions

Here is a Hackerrank exercise that allows us to think of improving program speed in multiple ways.
This is my solution, after a few attempts trying to pass the last two test cases.

    import java.util.*;
    public class Contiguous {
        public static void main(String[] args) {
            Scanner in = new Scanner(;
            Deque<Integer> deque = new ArrayDeque<Integer>();
            int n = in.nextInt();
            int m = in.nextInt();

            int[] arr = new int[n];
            int maxUnqs = 0;
            int unqs = 0;
            Map<Integer, Integer> map = new HashMap<Integer, Integer>();
            for (int i = 0; i < n; i++) {
                int num = in.nextInt();
                deque.add(num); //adds to tail
                //count occurrence of each number
                Integer c = map.get(num); 
                if (c == null || c == 0) {
                } else {
                    map.put(num, c+1);
                if (deque.size() == m) { //we have added m elements; go compute uniques
                    if (unqs == m) {
                        maxUnqs = m; //we can't do better
                    if (unqs > maxUnqs) {
                        maxUnqs = unqs;
                    int first = deque.remove(); //we don't need the head any more; keep queue small
                    int firstCount = map.get(first);
                    if (firstCount == 1) { //the head had the only occurrence of the number, so uniques drop
                    map.put(first, firstCount-1); //one less occurrence of the first element now
            System.out.format("%d\n", maxUnqs);

What strikes me here is that a lot of different things are being done under a single loop. The code is therefore hard to read.

If the code is made to read well, we lose the speed.

The most readable code will do something like this:

1) Read the input into a collection (possibly an array)
2) Make sub arrays of size m, starting from position 0, ending at position n-m
3) For each sub array, count the uniques -> we could use a bitmap, a boolean array, or a hashmap for this
4) Keep track of the maximum unique count

However, this readable algorithm runs in O(nm) time, whereas the solution given above is O(n). The optimal solution keeps track of up to m integers read, and when the m+1 integer is read, it simply removes the effects of the first integer from the system. Thus we can go through each integer only once and solve the problem.

But this alone was not enough to pass the last two test cases. If we don't update <unqs> variable for each loop iteration, but compute it from the <map> each time we have  a queue of m items, the code is still too slow to pass, although the time complexity is the same. While the time complexity is the same, we roughly double the work here and times out.

This slower solution would, on each mth item go over the map, counting keys that mapped to non-zero values. This would take m steps, but since we do this only n/m times, the time complexity would still be O(nm), but it would mean that for each n/m time slot, we do m+m = 2m work, effectively taxing the CPU twice.

Another solution I saw uses the map, but removes an element when its count drops to zero. Now we can use map.size() to calculate uniques. Since the map size can be read in constant time, this solution works as well to pass the tests.

All this led me to think about this Code Katta. The goals we set out will determine the kind of program we write.

Having said that, I believe it is possible to make this optimal program slightly more readable. If we encapsulate the state of the window of integers we are dealing with into a SlidingWindow class, we could implement four functions :

class SlidingWindow {
  void addNumberToWindow(int num)
  boolean isFullWindow()
  void processFullWindow()
  int maxUnique()

This will at least break up the input processing logic from the state handling. Lines 16 - 24 would get folded into addNumberToWindow; Lines 26 - 40 would be inside processFullWindow.

Thursday, October 08, 2015

Cartesian product in Scala

val s  = Seq("1","2","3","4")
val t  = Seq("a","b","c","d","e","f")
val u  = Seq("x","y")
val v  = Seq("m","l","n","o","p")

val sq = Seq(s,t,u,v)

sq.foldLeft(Seq(""))((b,a) => b.flatMap{i=>{j=>i+j}})

res9: Seq[String] = List(1axm, 1axl, 1axn, 1axo, 1axp, 1aym, 1ayl, 1ayn, 1ayo, 1ayp, 1bxm, 1bxl, 1bxn, 1bxo, 1bxp, 1bym, 1byl, 1byn, 1byo, 1byp, 1cxm, 1cxl, 1cxn, 1cxo, 1cxp, 1cym, 1cyl, 1cyn, 1cyo, 1cyp, 1dxm, 1dxl, 1dxn, 1dxo, 1dxp, 1dym, 1dyl, 1dyn, 1dyo, 1dyp, 1exm, 1exl, 1exn, 1exo, 1exp, 1eym, 1eyl, 1eyn, 1eyo, 1eyp, 1fxm, 1fxl, 1fxn, 1fxo, 1fxp, 1fym, 1fyl, 1fyn, 1fyo, 1fyp, 2axm, 2axl, 2axn, 2axo, 2axp, 2aym, 2ayl, 2ayn, 2ayo, 2ayp, 2bxm, 2bxl, 2bxn, 2bxo, 2bxp, 2bym, 2byl, 2byn, 2byo, 2byp, 2cxm, 2cxl, 2cxn, 2cxo, 2cxp, 2cym, 2cyl, 2cyn, 2cyo, 2cyp, 2dxm, 2dxl, 2dxn, 2dxo, 2dxp, 2dym, 2dyl, 2dyn, 2dyo, 2dyp, 2exm, 2exl, 2exn, 2exo, 2exp, 2eym, 2eyl, 2eyn, 2eyo, 2eyp, 2fxm, 2fxl, 2fxn, 2fxo, 2fxp, 2fym, 2fyl, 2fyn, 2fyo, 2fyp, 3axm, 3axl, 3axn, 3axo, 3axp, 3aym, 3ayl, 3ayn, 3ayo...

Saturday, September 05, 2015

Scala : Another example of collectFirst

Say, we want to find the population of the native country of the first European bird in a list of birds. This can be done by flatmapping over the birds, thus ignoring the non-European birds, and then getting the head of the resulting list. But then, we have to go over each bird, although we need to just get the first bird. If the list is sufficiently large, this is very inefficient.

The other approach is to use find to get the first European bird, and then simply get its native country's population. But here we have to get the native country of the bird twice. In this example, that is cheap, but if this information requires an expensive database retrieval, then we may not want to do it twice. Where collectFirst becomes very important is when this operation is very expensive - imagine doing a statistical calculation that provides the value for the match.

case class Bird(name: String)

object PlayCollect {

  val birds = Seq(Bird("jaybird"), Bird("owl"), Bird("sparrow"))

  val nativeCountries  = Map( (Bird("jaybird")->"Malaysia"), (Bird("sparrow")->"Ireland"), (Bird("owl")->"Chile") )
  val continents = Map( ("Malaysia" -> "Asia"), ("Ireland" -> "Europe"), ("Chile" -> "South America") )
  val population = Map( ("Malaysia"->10000000), ("Ireland" -> 12000000), "Chile"->45000000 )

  val popEuBird = new PartialFunction[Bird, Int] {
    def apply(b: Bird) = {
      continents(nativeCountries(b)) match {
        case "Europe" => population(nativeCountries(b))
    def isDefinedAt(b: Bird) = {
      continents(nativeCountries(b)) match {
        case "Europe" => true
        case _ => false

  def populationOfNativeCountryOfFirstEuropeanBird: Option[Int] = {
    birds collectFirst popEuBird

  def main (args: Array[String]) {
    populationOfNativeCountryOfFirstEuropeanBird.foreach{pop => println(pop)}

Wednesday, September 02, 2015

Scala : collectFirst example

On occasion, you want to find the first occurrence of an item in a list, then transform it to another type. Using partial functions with collectFirst, we can accomplish this.

collectFirst will first call isDefinedAt to determine if apply should be called for the current item in the list. Thus it will skip over the non matching elements in the list without incurring a match exception.

trait Animal
case class Mammal(name: String) extends Animal
case class Bird(name: String) extends Animal

val animals = Seq(Mammal("elephant"), Mammal("tiger"), Bird("raven"), Mammal("monkey"), Bird("peacock"), Bird("sparrow"))

    val matchBird = new PartialFunction[Animal, Bird] {
      def apply(p: Animal) = {
        p match {
          case b@Bird(name) => b

      def isDefinedAt(p: Animal) = {
          p match {
            case Bird(name) => true
            case _ => false

    animals collectFirst matchBird

    res11: Option[Bird] = Some(Bird(raven))

Friday, February 27, 2015

Titan : using native hadoop libraries on MacOSX

Once you have built the native hadoop libraries on your MacOSX, you need to add this bit of code to bin/ so that it can find them:
if [ -e "${HADOOP_PREFIX}/lib/native/libhadoop.dylib" ]; then
   if [ -n "${LD_LIBRARY_PATH:-}" ]; then
       LD_LIBRARY_PATH="${LIB_PATH}:${LD_LIBRARY_PATH}"     # For Linux

   if [ -n "${DYLD_LIBRARY_PATH:-}" ]; then

The only oddity here is that the script uses "set -u" at the top, which makes bash complain if you use uninitialized variables. So you have to append ":-" to the variables that you are testing. You can see that in the lines that test LD_LIBRARY_PATH etc.