Commit
Wrap up performance evaluation (#122)
* memory consumption profiling for 10e6 x 10 case

* Added elapsed time and memory consumption measurements

* address review comments
nkarag authored Mar 9, 2020
1 parent 736947e commit e29802d
Showing 5 changed files with 183 additions and 20 deletions.
14 changes: 7 additions & 7 deletions executable-spec/bench/micro-benchmarking/Main.hs
@@ -13,18 +13,18 @@ main :: IO ()
 main = do
   let
     !tallyData2 =
-      createTallyData constants (NumberOfParticipants 10) (NumberOfConcurrentUPs 10)
+      createTallyData constants (NumberOfParticipants 100) (NumberOfConcurrentUPs 1)
     !tallyData3 =
-      createTallyData constants (NumberOfParticipants 100) (NumberOfConcurrentUPs 10)
+      createTallyData constants (NumberOfParticipants 1000) (NumberOfConcurrentUPs 1)
     !tallyData4 =
-      createTallyData constants (NumberOfParticipants 1000) (NumberOfConcurrentUPs 10)
+      createTallyData constants (NumberOfParticipants 10000) (NumberOfConcurrentUPs 1)
     !tallyData5 =
-      createTallyData constants (NumberOfParticipants 10000) (NumberOfConcurrentUPs 10)
+      createTallyData constants (NumberOfParticipants 100000) (NumberOfConcurrentUPs 1)
     !tallyData6 =
-      createTallyData constants (NumberOfParticipants 100000) (NumberOfConcurrentUPs 10)
+      createTallyData constants (NumberOfParticipants 1000000) (NumberOfConcurrentUPs 1)
     !tallyData7 =
-      createTallyData constants (NumberOfParticipants 1000000) (NumberOfConcurrentUPs 10)
-  print $ runTally constants tallyData3
+      createTallyData constants (NumberOfParticipants 10000000) (NumberOfConcurrentUPs 1)
+  print $ runTally constants tallyData2
   Cr.defaultMain
     [ Cr.bgroup "tally" [ Cr.bench "1e2" $ Cr.whnf allApproved tallyData2
                         , Cr.bench "1e3" $ Cr.whnf allApproved tallyData3
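Editorial note: the `Cr.*` functions in the diff above come from the criterion benchmarking library, and the bang patterns (`!tallyData2`, ...) force each dataset to be built (to weak head normal form) before the benchmarks run. A minimal, self-contained sketch of the same pattern, with a stand-in workload `sumTo` invented here in place of the repository's `createTallyData` and `allApproved`:

module Main (main) where

import qualified Criterion.Main as Cr

-- Hypothetical stand-in workload; the real benchmark times
-- `allApproved tallyDataN` on pre-built tally data instead.
sumTo :: Int -> Int
sumTo n = sum [1 .. n]

main :: IO ()
main =
  Cr.defaultMain
    [ Cr.bgroup "tally"
        [ Cr.bench "1e2" $ Cr.whnf sumTo 100   -- whnf: force result, no caching
        , Cr.bench "1e3" $ Cr.whnf sumTo 1000
        ]
    ]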
23 changes: 23 additions & 0 deletions formal-spec/decentralized-updates.tex
@@ -29,6 +29,7 @@
\usepackage{pgfplots}
\usepackage{tikz}
\usepackage{hyperref}
\usetikzlibrary{arrows,automata, decorations.pathreplacing, positioning, arrows.meta, calc, shapes}
% For drawing simple diagrams involving arrows between LaTeX symbols
\usepackage{tikz-cd}
@@ -50,6 +51,28 @@

\newcommand{\nnote}[1]{{\color{blue}\small Nikos: #1}}


\lstdefinestyle{mystyle}{
% backgroundcolor=\color{backcolour},
commentstyle=\color{codegreen},
keywordstyle=\color{blue},
% numberstyle=\tiny\color{codegray},
stringstyle=\color{codepurple},
basicstyle=\ttfamily\footnotesize,
breakatwhitespace=false,
breaklines=true,
captionpos=b,
keepspaces=true,
numbers=left,
numbersep=5pt,
showspaces=false,
showstringspaces=false,
showtabs=false,
tabsize=2
}

\lstset{style=mystyle}

\begin{document}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
154 changes: 141 additions & 13 deletions formal-spec/measurements.tex
@@ -2,6 +2,8 @@
\section{Impact on performance}
\label{sec:impact-on-performance}

\subsection{Transaction Throughput}

An important measurement of a blockchain system's performance is the number of
transaction bytes per-second ($\mathit{TBPS}$) it can sustain. Unlike the more
commonly used metric, transactions per-second, this number is not dependent on
@@ -141,8 +143,8 @@ \section{Impact on performance}
     ylabel={Usage percentage},
     ymin=0.0,
     ymax=250.0,
-    ymode=linear,
-    ytick={5.0, 20.0, 50.0, 100.0, 150.0, 200.0, 250.0},
+    ymode=log,
+    ytick={0.001, 0.01, 0.1, 1, 10, 100, 1000},
     legend pos=north west,
     ymajorgrids=true,
     grid style=dashed,
@@ -155,25 +157,151 @@
\label{fig:usage-vs-participants}
\end{figure}

We can see that the impact on the system's performance is negligible
even when we consider $100,000$ participants. Moreover, the usage
percentage scales linearly with the number of participants: a tenfold
increase in the number of participants increases the required usage
percentage only tenfold. Likewise, doubling the throughput allows us
to process twice the participants' workload in the same time.
%
These results indicate that the update protocol will start degrading
the system's performance \emph{only} past 1,000,000 participants,
and even then only if worst-case conditions are met: 10 SIPs being
voted on at the same time over a period of 7 days, with each
participant voting twice.
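
To make this worst case concrete, a back-of-the-envelope check (an
editorial addition, assuming the ballots arrive spread evenly over
the voting period):
\[
  \frac{10 \;\mbox{SIPs} \times 1{,}000{,}000 \;\mbox{participants} \times 2 \;\mbox{votes}}
       {7 \times 24 \times 3600 \;\mbox{s}}
  \approx 33 \;\mbox{votes per second}.
\]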
%
In such a case, relying on \emph{voting pools} (or \emph{expert
pools}) becomes crucially important: delegating voting rights can
help the update system scale beyond this number of participants.
Alternatively, increasing the duration of the voting period can
mitigate the impact on the system's performance.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% TODO: we need to revisit the section below considering the insight of
%% the section above.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Processing Time and Memory Consumption}
Clearly the most processing-intensive task of the update mechanism
is the tally phase. It is the phase where all the collected votes
are counted in order to reach a decision on a specific proposal.

We start with a theoretical time-complexity analysis, assuming a
worst-case scenario in which each of the $n$ participants votes by
submitting a single vote. We also assume a single proposal, so that
within a voting period the number of submitted ballots coincides
with the number $n$ of participants.

In the following we break down the operations performed during the
tally phase. At the heart of the tally phase lies the following
function, which is called once for each proposal.

\begin{lstlisting}[language=Haskell, caption=Tally phase initial function call]
tallyStake confidence result ballot stakeDistribution adversarialStakeRatio =
  if stakeThreshold adversarialStakeRatio (totalStake stakeDistribution)
     <
     stakeOfKeys votingKeys stakeDistribution
    then Just result
    else Nothing
  where
    votingKeys = Map.filter (== confidence) ballot
\end{lstlisting}

\lstinline{Map.filter}\footnote{\url{http://hackage.haskell.org/package/containers-0.6.2.1/docs/Data-Map-Strict.html\#g:25}}
is $O(n)$, so \lstinline{votingKeys} is $O(n)$, where $n$ is the
number of ballots, which, as noted above, coincides with the number
of participants. At this point we have a single pass (loop) over the
$n$ ballots.

Furthermore, \lstinline{stakeOfKeys} makes the following calls:
\begin{lstlisting}[language=Haskell, caption=Computing the stake of the voting keys]
stakeOfKeys keyMap StakeDistribution { stakeMap } =
  Map.foldl (+) 0 $ stakeMap `Map.intersection` keyMap
\end{lstlisting}

The \lstinline{Map.intersection} function is $O(n)$ in the worst
case\footnote{\url{http://hackage.haskell.org/package/containers-0.6.2.1/docs/Data-Map-Strict.html\#v:intersection}},
so this is a second pass (loop) over the data of length $n$.
\lstinline{Map.foldl} is also
$O(n)$\footnote{\url{http://hackage.haskell.org/package/containers-0.6.2.1/docs/Data-Map-Strict.html\#v:foldl}},
a third pass (loop) over the data of length $n$. Thus, for a single
proposal we have one call of \lstinline{tallyStake}, and within that
call three passes over the data of length $n$, that is, $3n$
operations, which means that the tally time complexity is $O(n)$.
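
To make the three passes concrete, here is a minimal, self-contained
sketch (an editorial illustration, not code from this repository)
that replays the same shape of computation on toy data;
\lstinline{Key} and \lstinline{Stake} are stand-ins for the spec's
actual key and stake types.
\begin{lstlisting}[language=Haskell, caption=Editorial sketch of the three tally passes]
import qualified Data.Map.Strict as Map

data Confidence = For | Against | Abstain deriving (Eq, Show)

type Key   = Int
type Stake = Word

-- Stake voting 'For': the same three O(n) passes as tallyStake above.
stakeFor :: Map.Map Key Confidence -> Map.Map Key Stake -> Stake
stakeFor ballot stakeMap =
  Map.foldl (+) 0                          -- pass 3: sum the stake
    (stakeMap `Map.intersection`           -- pass 2: restrict to voters
       Map.filter (== For) ballot)         -- pass 1: keep 'For' ballots

main :: IO ()
main =
  -- prints 16: the combined stake of keys 1 and 3
  print $ stakeFor (Map.fromList [(1, For), (2, Against), (3, For)])
                   (Map.fromList [(1, 5), (2, 7), (3, 11)])
\end{lstlisting}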

This result is also confirmed by the experimental evaluation
shown in the graph below:

\begin{figure}[htp]
\centering

\begin{tikzpicture}
\begin{axis}[
title={Elapsed time (sec) vs Number of participants},
xlabel={Participants},
xmin=100.0,
xmax=1.0e7,
xmode=log,
xtick={10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, 1.0e7},
ylabel={Elapsed time (sec)},
ymin=0.0000001,
ymax=10,
ymode=log,
ytick={0.0000001, 0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1, 10},
legend pos=north west,
ymajorgrids=true,
grid style=dashed,
]
\addplot[color=black] table {participants-vs-elapsed_time.dat};
\end{axis}
\end{tikzpicture}

\caption{Worst case scenario analysis for tally phase processing time}
\label{fig:eltime-vs-participants}
\end{figure}

In Figure \ref{fig:eltime-vs-participants}, we see that the elapsed
time increases linearly with the number of participants. In addition,
we see that it takes almost one tenth of a second to process the
votes of $1$ million participants. These results correspond to a
non-parallel (single-threaded) execution of the tally algorithm on
an i7 laptop with 32GB of RAM.
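
The numbers above come from the repository's own harness (the
criterion benchmarks in this commit); as a rough editorial sketch of
how a single such measurement could be taken by hand, with the helper
\lstinline{timed} invented here for illustration:
\begin{lstlisting}[language=Haskell, caption=Editorial sketch: timing a pure computation]
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Force a value to weak head normal form, report elapsed CPU seconds.
timed :: a -> IO Double
timed x = do
  start <- getCPUTime              -- picoseconds
  _     <- evaluate x
  end   <- getCPUTime
  pure (fromIntegral (end - start) / 1e12)

main :: IO ()
main = timed (sum [1 .. 1000000 :: Int]) >>= print
\end{lstlisting}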

Finally, we present the measurements of the memory consumption during
the tally phase. Again, as the graph in
Figure \ref{fig:memcons-vs-participants} shows, the allocated memory
scales linearly with the number of participants.

\begin{figure}[htp]
\centering

\begin{tikzpicture}
\begin{axis}[
title={Memory consumed (bytes) vs Number of participants},
xlabel={Participants},
xmin=100.0,
xmax=1.0e7,
xmode=log,
xtick={10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, 1.0e7},
ylabel={Memory consumed (bytes)},
ymin=1.0e6,
ymax=1.0e12,
ymode=log,
ytick={1.0e6, 1.0e7, 1.0e8, 1.0e9, 1.0e10, 1.0e11, 1.0e12},
legend pos=north west,
ymajorgrids=true,
grid style=dashed,
]
\addplot[color=black] table {participants-vs-memory_consumed.dat};
\end{axis}
\end{tikzpicture}

\caption{Worst case scenario analysis for tally phase consumed memory}
\label{fig:memcons-vs-participants}
\end{figure}
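
As an editorial aside, total allocation figures like those plotted
above can also be read straight from GHC's runtime system when the
program is run with \lstinline{+RTS -T}; this is a sketch of that
technique, not necessarily the measurement setup used for this
commit's profiling:
\begin{lstlisting}[language=Haskell, caption=Editorial sketch: reading allocated bytes from the GHC RTS]
import Control.Exception (evaluate)
import GHC.Stats (allocated_bytes, getRTSStats, getRTSStatsEnabled)

main :: IO ()
main = do
  _       <- evaluate (sum [1 .. 1000000 :: Int])
  enabled <- getRTSStatsEnabled    -- True only when run with +RTS -T
  if enabled
    then getRTSStats >>= print . allocated_bytes
    else putStrLn "re-run with +RTS -T to enable RTS statistics"
\end{lstlisting}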

\section{Measurements specification} \label{sec:measurements}

In this section we want to describe an experimental evaluation of our proposed
6 changes: 6 additions & 0 deletions formal-spec/participants-vs-elapsed_time.dat
@@ -0,0 +1,6 @@
1.00E+02 4.01E-06
1.00E+03 4.23E-05
1.00E+04 4.53E-04
1.00E+05 6.70E-03
1.00E+06 7.50E-02
1.00E+07 2.92E+00
6 changes: 6 additions & 0 deletions formal-spec/participants-vs-memory_consumed.dat
@@ -0,0 +1,6 @@
1.00E+02 3.22E+06
1.00E+03 3.15E+07
1.00E+04 3.18E+08
1.00E+05 3.22E+09
1.00E+06 3.25E+10
1.00E+07 3.27E+11
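
Editorial sanity check on the memory table above: dividing each
measurement by its participant count gives a near-constant 3.2e4
bytes per participant, which is what linear scaling predicts. A
throwaway snippet to reproduce the division:

-- Bytes allocated per participant for each row of
-- participants-vs-memory_consumed.dat; every ratio is close to 3.2e4.
main :: IO ()
main =
  mapM_ (\(n, bytes) -> print (bytes / n))
    [ (1.0e2, 3.22e6), (1.0e3, 3.15e7), (1.0e4, 3.18e8)
    , (1.0e5, 3.22e9), (1.0e6, 3.25e10), (1.0e7, 3.27e11) ]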
