ACM Transactions on Modeling and Performance Evaluation of Computing Systems, volume 4, issue 1, pages 1-32

A Quantitative and Comparative Study of Network-Level Efficiency for Cloud Storage Services

Publication type: Journal Article
Publication date: 2019-01-05
Scimago: Q2
SJR: 0.525
CiteScore: 2.1
Impact factor: 0.7
ISSN: 2376-3639, 2376-3647
Computer Science (miscellaneous)
Hardware and Architecture
Information Systems
Computer Networks and Communications
Software
Safety, Risk, Reliability and Quality
Media Technology
Abstract

Cloud storage services such as Dropbox and OneDrive provide users with a convenient and reliable way to store and share data from anywhere, on any device, and at any time. Their cornerstone is the data synchronization (sync) operation, which automatically maps the changes in users’ local file systems to the cloud via a series of network communications in a timely manner. Without careful design and implementation, however, the data sync mechanisms could generate overwhelming traffic, causing tremendous financial overhead and performance penalties to both service providers and end users. This article addresses a simple yet critical question: Is the current data sync traffic of cloud storage services efficiently used? We first define a novel metric, TUE, to quantify the Traffic Usage Efficiency of data synchronization. Then, by conducting comprehensive benchmark experiments and reverse engineering the data sync processes of eight widely used cloud storage services, we uncover their manifold practical endeavors for optimizing the TUE, including three intra-file approaches (compression, incremental sync, and interrupted transfer resumption), two cross-file/-user approaches (i.e., deduplication and peer-assisted offloading), two batching approaches (file bundling and sync deferment), and two web-specific approaches (thumbnail views and dynamic content loading). Our measurement results reveal that a considerable portion of the data sync traffic is, in a sense, wasteful and can be effectively avoided or significantly reduced via carefully designed data sync mechanisms. Most importantly, our study not only offers practical, actionable guidance for providers to build more efficient, traffic-economic services, but also helps end users pick appropriate services that best fit their use cases and budgets.

Lim M.
Applied Sciences (Switzerland) scimago Q2 wos Q2 Open Access
2023-05-12 citations by CoLab: 0 Abstract
In this paper, we propose an active file mode change mechanism in which the file synchronization system of cloud storage automatically switches files in a client directory between the online and local modes, weighing the tradeoff between local storage usage and file access time according to the directory activation ratio. When the directory activation ratio rises above a certain threshold, the proposed mechanism selects online mode files in this directory based on file access delay time and local storage usage and changes them to the local mode to reduce the file access delay of active IoT clients. When the directory activation ratio falls below the threshold, the mechanism selects local mode files based on the last access time and local storage usage and changes them to the online mode to increase available local storage. Experimental results show that the proposed mechanism can control when and by how much the client reduces or increases local storage usage and file access delay, by changing file mode parameters according to the requirements of various IoT devices.
Zhao M., Chen J., Li Z.
2021-12-01 citations by CoLab: 0 Abstract  
Cloud storage services (e.g., Dropbox) have become pervasive in not only simple file sharing but also advanced collaborative file editing (collaboration for short). Using Dropbox for collaboration is much easier than SVN and Git, thus greatly facilitating common users. In practice, however, many Dropbox users are perplexed by unexpected collaboration conflicts, which severely impair their experiences. Through various benchmark experiments, we unveil the two root causes of collaboration conflicts: 1) Dropbox never locks an edited file during collaboration; 2) Dropbox only guarantees eventual data consistency among the collaborators, significantly increasing the probability of conflicts. In this paper, we attempt to enable conflict-free collaborations with Dropbox-like cloud storage services. This attempt is empowered by three key findings and measures. First, although the end-to-end sync delay is unpredictable due to eventual consistency, we can always track the latest version of an edited file by actively resorting to the cloud via certain web APIs. Second, although all application-level data is encrypted in Dropbox, we can roughly deduce the sync status from traffic statistics. Third, applying a couple of useful mechanisms (e.g., distributed architecture and data lock) learned from Git, we can effectively and efficiently avoid collaboration conflicts; of course, this requires re-implementing Git mechanisms in cloud storage services with minimum overhead and user interference. Integrating the above efforts, we build the ConflictReaper system, capable of helping users automatically avoid almost all collaboration conflicts with affordable network and computation overhead.
Gonçalves G.D., Drago I., Vieira A.B., da Silva A.P., de Almeida J.M.
2021-04-03 citations by CoLab: 1 Abstract  
Personal Cloud Storage (PCS) is a very popular Internet service. In addition to backup, PCS allows content sharing among multiple devices, which is a valuable functionality for many users. Yet maintaining a PCS service incurs both storage and bandwidth costs to service providers, as the update and successive downloads of shared files may generate extra traffic on the PCS cloud servers. This becomes particularly worrisome as a large fraction of users (e.g., over 90%) does not pay for the service, joining a free version with limited storage capacity but typically without content sharing restrictions. Thus, a natural concern arises on the costs and benefits of such service, for both provider and users. In this paper, we propose a model to analyze the cost-benefit tradeoffs of content sharing in PCS services for both parties. Our model uses a macroeconomic concept, notably user surplus, to capture the satisfaction of different classes of users, as well as a cost saving function to capture the interest of the provider in reducing bandwidth costs. We use our model to evaluate two alternative policies to the content sharing architecture in use in existing PCS services, searching for scenarios in which both parties benefit. The proposed policies rely on incentives given to users in exchange for their participation in offloading shared content from cloud servers. Our investigation, based on analytical modeling and data-driven experiments, shows that the incentive leads to greater satisfaction for both parties, and that the alternative policies can reach scenarios which benefit both provider and users, with reductions in provider’s bandwidth costs by 20% and increases in user satisfaction from 51 to 82% under reasonable model assumptions.
Shen Y., Du B., Xu W., Luo C., Wei B., Cui L., Wen H.
2020-04-09 citations by CoLab: 9 Abstract  
Since ancient Greece, handshaking has been commonly practiced between two people as a friendly gesture to express trust and respect, or form a mutual agreement. In this article, we show that such physical contact can be used to bootstrap secure cyber contact between the smart devices worn by users. The key observation is that during handshaking, although belonging to two different users, the two hands involved in the shaking events are often rigidly connected, and therefore exhibit very similar motion patterns. We propose a novel key generation system, which harvests motion data during user handshaking from wrist-worn smart devices such as smartwatches or fitness bands, and exploits the matching motion patterns to generate symmetric keys on both parties. The generated keys can then be used to establish a secure communication channel for exchanging data between devices. This provides a much more natural and user-friendly alternative for many applications, e.g., exchanging/sharing contact details, friending on social networks, or even making payments, since it neither involves extra bespoke hardware nor requires the users to perform pre-defined gestures. We implement the proposed key generation system on off-the-shelf smartwatches, and extensive evaluation shows that it can reliably generate 128-bit symmetric keys after only about 1 s of handshaking (with success rate >99%), and is resilient to different types of attacks including impersonate mimicking attacks, impersonate passive attacks, or eavesdropping attacks. Specifically, for real-time impersonate mimicking attacks, in our experiments, the Equal Error Rate (EER) is only 1.6% on average. We also show that the proposed key generation system can be extremely lightweight and is able to run in-situ on resource-constrained smartwatches without incurring excessive resource consumption.
Song B., Trachtenberg A.
2019-09-01 citations by CoLab: 0 Abstract  
We consider the problem of reconciling similar, but remote, strings with minimum communication complexity. This “string reconciliation” problem is a fundamental building block for a variety of networking applications, including those that maintain large-scale distributed networks and perform remote file synchronization. We present the novel Recursive Content-Dependent Shingling (RCDS) protocol that is computationally practical for large strings and scales linearly with the edit distance between the remote strings. We provide comparisons to the performance of rsync, one of the most popular file synchronization tools in active use. Our experiments show that, with minimal engineering, RCDS outperforms the heavily optimized rsync in reconciling release revisions for about 51% of the 5000 top starred git repositories on GitHub. The improvement is particularly evident for repositories that see frequent, but small, updates.
Chen Z., He Q., Mao Z., Chung H., Maharjan S.
2019-05-17 citations by CoLab: 53 Abstract  
Douyin, internationally known as TikTok, has become one of the most successful short-video platforms. To maintain its popularity, Douyin has to provide better Quality of Experience (QoE) to its growing user base. Understanding the characteristics of Douyin videos is thus critical to its service improvement and system design. In this paper, we present an initial study on the fundamental characteristics of Douyin videos based on a dataset of over 260 thousand short videos collected across three months. The characteristics of Douyin videos are found to be significantly different from those of traditional online videos, ranging from video bitrate and size to popularity. In particular, the distributions of the bitrate and size of videos follow a Weibull distribution. We further observe that the most popular Douyin videos follow Zipf's law on video popularity, but the rest of the videos do not. We also investigate the correlation between popularity metrics used for Douyin videos. It is found that the correlation between the number of views and the number of likes is strong, while other correlations are relatively low. Finally, by using a case study, we demonstrate that the above findings can provide important guidance on designing an efficient edge caching system.
Li Z., Dai Y., Chen G., Liu Y.
2012-02-24 citations by CoLab: 0 Abstract  
This chapter discusses important emerging techniques and the possible future work on Internet content distribution.

