Gumana Series – Understanding Fabric’s Ordering

Introduction to Gumana

Gumana is a portmanteau of the Korean words 금요일 (Friday) and 만나다 (to meet). It is a series of technical talks held every Friday at Medium.

About this talk

A visual explanation of Fabric’s Kafka-based ordering design: why it needs a queue, how Kafka is used, and how Orderer Nodes are synchronized.

Presentation slide

Presentation video

Medium interview questions

Based on Korean Version 1.9.6 (https://hamait.tistory.com/1054), last updated October 2019.

You are not expected to be 100% knowledgeable about all of these. Instead, show the depth of your understanding in the areas you have experience with or are interested in. Focus on what you know well.

Part 1: Blockchain

  • What are double spending, replay attacks and eclipse attacks?

Part 2: Bitcoin

  • How can we ensure the integrity of Bitcoin transactions? How do you trust the previous output in the input of the next transaction?
  • What’s bloom filter SPV in Bitcoin?

Part 3: Ethereum

  • What’s the difference between Transaction and Raw Transaction?
  • What’s nonce in an Ethereum transaction? Why is there no nonce in Bitcoin?

Part 4: Hyperledger fabric

  • Explain the transaction flow of Hyperledger fabric
  • What is MVCC Collision and Optimistic Lock on Hyperledger Fabric?
  • What is MSP in Hyperledger Fabric?
  • What are channel MSPs and network MSPs in Hyperledger Fabric?
  • What’s a nonce in Hyperledger Fabric? How does it differ from Ethereum’s?
  • How are events created in Hyperledger Fabric, and how can the client know about an event?

Part 5: EOS

Part 6: Hyperledger Indy

Part 7: Consensus

  • What are the advantages and disadvantages of the E-O-V consensus process in Hyperledger Fabric?

Part 8: Software

  • Tell us about three design patterns you usually use. Write a pseudo-code implementation of the Observer pattern
  • Implement pseudo-code to distribute work among multiple threads and wait for them to finish
  • Give me three examples of how to waste space (memory) to improve performance
  • What is padding, packing in memory alignment?

Part 9: Java

  • Explain how Java passes method arguments. What are shallow copy and deep copy?
  • What is the logic error in the following servlet call code (the target should be executed once after passing through the filters)?
//// Filter 
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {

        ...  encoding handling OR
        ...  logging OR
        ...  authentication

        chain.doFilter(request, response);
     
        ...  
}

//// FilterChain

public class FilterChain { 
   private List filters = new ArrayList(); 
   private Target target; 
   
   int currentFilter = 0; 

   public void addFilter(Filter filter){ 
      filters.add(filter); 
   } 

   public Filter getNextFilter(){ 
      if(currentFilter < filters.size()){ 
           return filters.get(currentFilter++); 
      } 
      return null; 
   } 
   public void doFilter(String request, String response){ 
         Filter f = getNextFilter(); 
         if(f != null){  
           f.doFilter(request,response,this);         
         } 
           
         target.execute(request,response); 
   } 

   public void setTarget(Target target){ 
      this.target = target; 
   } 
} 

Part 10: C++

  • What matters most for performance degradation and improvement in C++?
  • Parse a single line to collect space-separated words, then print each word and the number of times it is duplicated. (With performance and memory optimization)
  • Write a function that takes a string as a parameter and returns a string with certain characters removed. (With performance optimization)
  • Briefly describe the auto / override / nullptr / constexpr / atomic keywords in C++
  • Please explain the following C++ code. (It is the consumer side of a producer-consumer pattern, and there is only one consumer here)
Buffer BufferPool::get_buf(){   
   Buffer* buf = nullptr;
   std::unique_lock<std::mutex> ul(_mtx, std::defer_lock);

   while (buf == nullptr){
    ul.lock();
    if (_pool.empty()) _cond.wait(ul);

    if (!_pool.empty())  // When can the pool be empty here?
    {
       buf = _pool.get();
    }
  }

   .... DO something ....
  return buf;
}

Part 11: Go

  • How is the select statement used in Go? Please explain the code below.
package main

import (
   "fmt"
   "time"
)

var scheduler chan string

func consuming (prompt string){
      fmt.Println("consuming called")
   select {
   case scheduler <- prompt:
      fmt.Println("Received the name: ", <- scheduler)
   case <-time.After(5 * time.Second):
      fmt.Println("Time is up.")
   }
}

func producing (console chan string) {
   var name string
   fmt.Print("Name:")
   fmt.Scanln(&name)
   console <- name
}
func main() {
   console := make(chan string, 1)
   scheduler = make(chan string, 1)

   go func(){
      consuming(<-console)
   }()

   go producing(console)

   time.Sleep(100 * time.Second)
}

Part 12: Javascript

  • What are built-in JavaScript objects / browser objects / HTML DOM objects?
  • What is the difference between AJAX and WebSocket communication?
  • Show your previous works in React & CSS Styling

Part 13: Distributed systems

  • What is consistent hashing?
  • What is HAProxy?
  • What is ZooKeeper? Give two examples of where you should use it

Part 14: Compilers

  • How does EOS charge for resources?
  • How to compute CPU, Memory and Storage usage in a program written in C++ or Go?

Part 15: Cryptography

  • What is HMAC / PKI / ECDSA / ECDH
  • What is ECert in Hyperledger Fabric? Why does Hyperledger fabric use it?
  • How are zero knowledge proofs used in Fabric Identity Mixer?

Part 16: Database

  • Compare Red Black tree & B tree & Skip lists data structures.

Part 17: Messaging

Part 18: Networking / Socket

  • Tell me as much as you know about the differences between these socket communication models: multithreaded / select / Java NIO / epoll / IOCP.

Experience with the following tools

  • Agile Management Techniques (* JIRA)
  • Product & Configuration Management (Bitbucket)
  • Containerization, like Docker + Kubernetes
  • Build Automation (* Bamboo)
  • Test Automation (* Unit Test gTest Study)
  • Issue Registration Automation (* JIRA)
  • Information sharing wiki management (confluence)
  • Information sharing chat management (slack)
  • Deployment Automation
  • Service Management Automation
  • Understanding Your Networking Infrastructure
  • Understanding Vertical / Horizontal Segmentation
  • Understanding and building a non-stop system (extending non-stop resources, etc.)
  • AWS Management

How to install nvm

This is a follow up to [How to install npm the right way]. It turns out that while convenient for Node development, nvm is notoriously slow. Thanks to reddit user sscotth we can solve that quite easily.

First, install nvm normally

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.34.0/install.sh | bash

Then find the lines nvm added to your .rc file (.bashrc or .zshrc) and delete them, or comment them out like this

# export NVM_DIR="$HOME/.nvm"
# [ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm
# [ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  # This loads nvm bash_completion

Next, add this to your .rc file

declare -a NODE_GLOBALS=(`find ~/.nvm/versions/node -maxdepth 3 -type l -wholename '*/bin/*' | xargs -n1 basename | sort | uniq`)

NODE_GLOBALS+=("node")
NODE_GLOBALS+=("nvm")

load_nvm () {
    export NVM_DIR=~/.nvm
    [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"
}

for cmd in "${NODE_GLOBALS[@]}"; do
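    # ${NODE_GLOBALS[*]} expands to every name (in bash, plain ${NODE_GLOBALS} is only the first element);
    # bash also needs the ';' before '}', and "$@" keeps arguments with spaces intact
    eval "${cmd}(){ unset -f ${NODE_GLOBALS[*]}; load_nvm; ${cmd} \"\$@\"; }"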
done

Your globally installed programs, like create-react-app, will still use the current version of node, while nvm is loaded only once, on first use, and no longer bogs down your terminal startup every time.

Win-win

Benchmark Flatbuffer / Protobuffer / C++ Struct performance

I’m building a project that requires maximal performance, but it also uses many data structures that need some flexibility. Naturally, Flatbuffer and Protobuffer are potential candidates, so I ran some benchmarks. Here are the results

Serialization / de-serialization

=================================
Raw structs bench start...
total = 15860948312882282624
Raw structs bench: 312 wire size, 210 compressed wire size
* 0.049046 encode time, 0.003316 decode time
* 0.067384 use time, 0.003062 dealloc time
* 0.073762 decode/use/dealloc
=================================
FlatBuffers bench start...
total = 15860948312882282624
FlatBuffers bench: 344 wire size, 220 compressed wire size
* 2.920470 encode time, 0.003350 decode time
* 0.740325 use time, 0.003186 dealloc time
* 0.746861 decode/use/dealloc
=================================
Protocol Buffers LITE bench start...
total = 15860948312882282624
Protocol Buffers LITE bench: 228 wire size, 174 compressed wire size
* 3.268726 encode time, 3.188535 decode time
* 0.200258 use time, 0.399030 dealloc time
* 3.787823 decode/use/dealloc

Include network communication

=================================
FLATBUF bench start...
total bytes = 15898507595776707224
* 0.003065 create time
* 0.238328 receive time
* 0.002950 use
* 0.000782 free
* 0.245125 total time
=================================
PROTOBUF bench start...
total bytes = 0
* 0.000766 create time
* 0.244944 receive time
* 0.001007 use
* 0.000785 free
* 0.247503 total time
=================================
RAW bench start...
total bytes = 54377074000
* 0.001709 create time
* 0.002417 receive time
* 0.000813 use
* 0.000759 free
* 0.005699 total time

The conclusion

  • While flatbuffer / protobuffer provide a convenient API to define data structures, let them be expanded dynamically and support a variety of languages, they are slower than just using raw structures (0.005699 total time for raw structs vs 0.245125 and 0.247503 in the network benchmark)
  • While flatbuffer is faster than protobuffer at pure serialization / deserialization (0.746861 vs 3.787823 for decode/use/dealloc), the difference shrinks to about 1% once remote RPC costs are included (0.245125 vs 0.247503 total time)
  • We need to test more recent serialization libraries, such as YAS and Cap’n Proto, and potentially combine them with the custom RPC model we have built on EVPP
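
To illustrate why the raw struct numbers are so low: encoding a plain struct is essentially a single memory copy. The following is a minimal sketch, not code from the benchmark repository. The `Record` layout and the `record_encode` / `record_decode` names are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record with fixed-size fields only (no pointers),
 * so its bytes can be copied to the wire as-is. */
typedef struct {
    uint32_t id;
    uint32_t flags;
    double   value;
    char     name[16];
} Record;

/* "Encode": one memcpy into the wire buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t record_encode(const Record *r, unsigned char *buf, size_t cap) {
    assert(r != NULL && buf != NULL);
    if (cap < sizeof *r) return 0;
    memcpy(buf, r, sizeof *r);
    return sizeof *r;
}

/* "Decode": one memcpy back out.
 * Returns 0 on success, -1 if the input is shorter than a Record. */
int record_decode(Record *out, const unsigned char *buf, size_t len) {
    assert(out != NULL && buf != NULL);
    if (len < sizeof *out) return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

The price of this speed is flexibility: the bytes depend on the compiler's padding and the machine's byte order, and there is no way to add a field without breaking old readers. That is the gap flatbuffer and protobuffer fill.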

The journey

It grinds my gears when code doesn’t work
  • Contrary to popular belief, Google’s code does not always work
  • The gRPC example in flatbuffer is outdated and is not working
  • The benchmark that proves flatbuffer is faster than protobuf is from 2016 and no longer compiles with the latest libraries

I fixed the above problems in https://github.com/thanhphu/flatbuffers

  • The latest flatbuffer no longer works with recent versions of gRPC due to some abstractions in gRPC’s data structures
  • I needed to modify gRPC to expose the legacy data structures that flatbuffer needs access to. Thus this repository was born: https://github.com/thanhphu/grpc
  • To make flatbuffer work, I needed to merge some recent contributions that solve the problem but did not conform to Google’s code standard

Finally, I needed to write benchmark code for all three (Flatbuffer + gRPC, Protobuf + gRPC, raw struct + EVPP). The complete code is available here

https://github.com/thanhphu/buffer-bench

Tags you can use

  • v1.1: Serialization / deserialization only, runs on one machine
  • v2.0: Serialization + deserialization + network transmission, can be run on two machines

Note that if you have already installed another version of gRPC and/or protobuf, you need to remove them with

$ sudo rm -f /usr/local/bin/*grpc*
$ sudo rm -f /usr/local/bin/protoc
$ sudo rm -f /usr/local/lib/*gpr*
$ sudo rm -f /usr/local/lib/*grpc*
$ sudo rm -f /usr/local/lib/*protobuf*
$ sudo rm -f /usr/local/lib/*protoc*
$ sudo rm -f /usr/local/lib/pkgconfig/*gpr*
$ sudo rm -f /usr/local/lib/pkgconfig/*grpc*
$ sudo rm -f /usr/local/lib/pkgconfig/*protobuf*
$ sudo rm -rf /usr/local/include/google
$ sudo rm -rf /usr/local/include/grpc
$ sudo rm -rf /usr/local/include/grpc++
$ sudo rm -rf /usr/local/include/grpcpp

How to write a C MQTT client using Mosquitto

Introduction

The 2018 version, based upon this excellent post by Kevin Boone:

Writing an MQTT client C for ActiveMQ from the ground up

The article above is a good and easy starting point, but it hasn’t been updated for 2 years so when you run it with the latest version of Mosquitto, it doesn’t work – and it’s a bit hacky (using “sleep” to avoid a concurrency problem).

So I analyzed the latest mosquitto_pub code from mosquitto repository itself to see how it’s working, and this article is the result.

What’s changed

  • There’s a queue inside mosquitto, and `mosquitto_loop` must be called for it to be processed. Alternatively, you can use `mosquitto_loop_start` and `mosquitto_loop_stop` to run the loop in a background thread
  • I added asynchronous (callback) processing to wait for calls to complete, instead of the ol’ sleep function
  • It’s 2018! Everyone is adopting HTTPS. Accordingly, your MQTT traffic shouldn’t be left bare for all to see! Let’s use TLS to encrypt the traffic

The code

How to use callback

I want to publish just one message, so my flow is the following

  • On connect complete -> publish a message
  • On publish complete -> start to disconnect
  • On disconnect complete -> exit the loop and return control to the main thread. If you don’t wait for this, data may not even get sent!

To do this, I set up 3 “hooks” (callback functions), like this

mosquitto_connect_callback_set(mosq, my_connect_callback);

mosquitto_disconnect_callback_set(mosq, my_disconnect_callback);

mosquitto_publish_callback_set(mosq, my_publish_callback);

And then write the callback functions to execute my flow

void my_connect_callback(struct mosquitto *mosq, void *obj, int result)
{
    int rc = MOSQ_ERR_SUCCESS;
    if(!result){
        printf("Sending message...\n");
        rc = mosquitto_publish(mosq, &mid_sent, MQTT_TOPIC, strlen(text), text, qos, retain);
        if(rc){
            switch(rc){
                case MOSQ_ERR_INVAL:
                    fprintf(stderr, "Error: Invalid input. Does your topic contain '+' or '#'?\n");
                    break;
                case MOSQ_ERR_NOMEM:
                    fprintf(stderr, "Error: Out of memory when trying to publish message.\n");
                    break;
                case MOSQ_ERR_NO_CONN:
                    fprintf(stderr, "Error: Client not connected when trying to publish.\n");
                    break;
                case MOSQ_ERR_PROTOCOL:
                    fprintf(stderr, "Error: Protocol error when communicating with broker.\n");
                    break;
                case MOSQ_ERR_PAYLOAD_SIZE:
                    fprintf(stderr, "Error: Message payload is too large.\n");
                    break;
            }
            mosquitto_disconnect(mosq);
        }
    } else {
        if(result){
            fprintf(stderr, "%s\n", mosquitto_connack_string(result));
        }
    }
}

void my_disconnect_callback(struct mosquitto *mosq, void *obj, int rc)
{
    printf("Disconnected!\n");
    connected = false;
}

void my_publish_callback(struct mosquitto *mosq, void *obj, int mid)
{
    printf("Published!\n");
    if(disconnect_sent == false){
        mosquitto_disconnect(mosq);
        disconnect_sent = true;
    }
}

How to process the queue with mosquitto_loop

In the main trunk of your code, do this

int rc;
do {
  // A short wait is needed so that mosquitto does not act before the network operation has finished
  rc = mosquitto_loop(mosq, -1, 1);
} while (rc == MOSQ_ERR_SUCCESS && connected);
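
To see why the loop has to keep running until the disconnect callback fires, here is a library-independent sketch of the idea. It is a toy model only: `toy_client`, `toy_connect`, `toy_loop` and `toy_run` are hypothetical names invented for this example and are not part of the Mosquitto API. Each call to `toy_loop` processes one queued event and triggers the next step of the flow, just as each pass through `mosquitto_loop` lets the library fire the callbacks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events standing in for the broker's acknowledgements. */
typedef enum { EV_CONNACK, EV_PUBACK, EV_DISCONNECTED } event_t;

#define QUEUE_CAP 8

/* Toy client: a model of "a queue that only moves when pumped". */
typedef struct {
    event_t queue[QUEUE_CAP];
    size_t  head, tail;   /* pending events are queue[head..tail) */
    bool    connected;
    int     published;    /* number of completed publishes */
} toy_client;

static void enqueue(toy_client *c, event_t ev) {
    if (c->tail < QUEUE_CAP) c->queue[c->tail++] = ev;
}

/* Start a connection: only queues the acknowledgement, nothing is sent yet. */
void toy_connect(toy_client *c) {
    assert(c != NULL);
    c->head = c->tail = 0;
    c->published = 0;
    c->connected = true;
    enqueue(c, EV_CONNACK);
}

/* Process one pending event and run its "callback".
 * Returns false when nothing is queued. */
bool toy_loop(toy_client *c) {
    if (c->head == c->tail) return false;
    switch (c->queue[c->head++]) {
    case EV_CONNACK:        /* connect complete -> publish a message */
        enqueue(c, EV_PUBACK);
        break;
    case EV_PUBACK:         /* publish complete -> start to disconnect */
        c->published++;
        enqueue(c, EV_DISCONNECTED);
        break;
    case EV_DISCONNECTED:   /* disconnect complete -> leave the loop */
        c->connected = false;
        break;
    }
    return true;
}

/* Pump the queue until the disconnect step has run.
 * Returns the number of loop iterations that were needed. */
int toy_run(toy_client *c) {
    int iterations = 0;
    while (c->connected && toy_loop(c)) iterations++;
    return iterations;
}
```

In this model, stopping after a single `toy_loop` call leaves `published` at 0: the message was queued but never went out, which is the same reason the real loop waits on the `connected` flag.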

How to add TLS to the connection process

Before connecting, set TLS options with mosquitto_tls_set

mosquitto_username_pw_set(mosq, MQTT_USERNAME, MQTT_PASSWORD);
mosquitto_tls_set(mosq, "ca-cert.pem", NULL, NULL, NULL, NULL);
int ret = mosquitto_connect(mosq, MQTT_HOSTNAME, MQTT_PORT, 0);

Complete publish – subscribe sample

Available at https://github.com/thanhphu/mosquitto-sample. Happy cloning!