Estimating File Availability
In order to compare the different replication
algorithms discussed in [1],
we must find a tractable and accurate method to compute file
availability. A straightforward approach would be to divide the
observed file uptime over the total simulation time. However,
estimating file availability in this way has two disadvantages:
First, the observed uptime value includes transient effects;
specially when the simulation starts without any replication at
all. Second, perhaps more significantly, we cannot simulate a long
enough interval to observe a significant amount of downtime for
highly available nodes. This effect causes the observed (past)
file availability to be overly optimistic and noisy to predict
future performance.
In order to understand the different alternatives for
estimating file availability, Figure 1 (below) presents the CDFs
of several estimators obtained from the same arrangements on a commercial
(CO) and a file sharing (FS) environments. First, the E curve is
the full combinatorial estimator mentioned in Section
3. The full estimator computes all possible failure scenarios
leading to data loss and therefore it becomes less tractable as
the number of fragments increases. The BAM curve uses equation
1 from Section
3 combined with the measured availability of peers from the
simulation; this is the estimator used in real-life by peers to
drive the REP algorithm. The BAE estimator is like BAM but using
the expected availability of peers as drawn from the node
availability CDF. Finally, curve M is the measured availability
after letting the algorithm run for sufficient time, which has the
drawback mentioned above.
Using these results we can compare the different estimators.
Notice that BAM and BAE are pessimistic with respect to E, while M
is slightly optimistic. Since our simulations run for a fixed period
of time (6 simulated months on average) we need to limit the
estimated availability for files and nodes that were never
unavailable. As we use 4 nines to limit the observed availability
the results on Figure 1 are truncated at 4 and therefore we can
not see a bigger difference between E and M.
The important conclusion that we draw from these results is
that BAM, which is the only metric that can be computed in real time
to drive the algorithm, is very accurate for low availability levels
and always pessimistic with respect to other estimators.
|