Smart Cloud Caching for Data Intensive Applications
As Cloud computing gains popularity among small and medium enterprises, Cloud storage solutions such as Amazon S3 are increasingly used for storing, maintaining, and serving application data. Even with the high-speed network connections typically available between applications and Cloud storage, accessing remote data remains much slower than reading it from local memory or even from locally attached disks. SMACC is a novel Cloud caching service developed at CUT that runs on application compute nodes (e.g., on Amazon EC2) and caches frequently used data residing on Amazon S3 in local memory and on locally attached disks (e.g., Amazon EBS) using new smart policies. SMACC also provides an HDFS-compatible API, which big data platforms such as Spark and Hadoop can use to process data residing on Amazon S3 while data blocks are cached on the various compute nodes for increased performance.
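The two-tier idea described above (memory first, then local disk, then the remote store) can be illustrated with a minimal sketch. This is not SMACC's implementation or API: the class name TieredCache, the fetch_remote callback standing in for an S3 read, and the object-count capacities are all hypothetical, and plain LRU eviction is used here only as a placeholder for SMACC's smart policies, which are not specified in this description. The disk tier is simulated with an in-memory dictionary so that the example stays self-contained.

```python
from collections import OrderedDict
from typing import Callable, Dict


class TieredCache:
    """Read-through cache: a small memory tier backed by a larger 'disk' tier.

    Hypothetical illustration only. LRU is a stand-in eviction policy.
    """

    def __init__(self, fetch_remote: Callable[[str], bytes],
                 mem_capacity: int, disk_capacity: int) -> None:
        self.fetch_remote = fetch_remote    # assumed stand-in for a remote (e.g. S3) read
        self.mem_capacity = mem_capacity    # max number of objects in the memory tier
        self.disk_capacity = disk_capacity  # max number of objects in the disk tier
        self.mem: "OrderedDict[str, bytes]" = OrderedDict()
        self.disk: "OrderedDict[str, bytes]" = OrderedDict()  # simulates local disk
        self.stats: Dict[str, int] = {"mem": 0, "disk": 0, "remote": 0}

    def get(self, key: str) -> bytes:
        # 1. Fastest path: object is already in memory.
        if key in self.mem:
            self.mem.move_to_end(key)       # mark as most recently used
            self.stats["mem"] += 1
            return self.mem[key]
        # 2. Next: object is on the local disk tier; promote it to memory.
        if key in self.disk:
            data = self.disk.pop(key)
            self.stats["disk"] += 1
        # 3. Slowest path: fetch from remote storage.
        else:
            data = self.fetch_remote(key)
            self.stats["remote"] += 1
        self._put_mem(key, data)
        return data

    def _put_mem(self, key: str, data: bytes) -> None:
        self.mem[key] = data
        self.mem.move_to_end(key)
        if len(self.mem) > self.mem_capacity:
            # Demote the least recently used object to the disk tier.
            old_key, old_data = self.mem.popitem(last=False)
            self._put_disk(old_key, old_data)

    def _put_disk(self, key: str, data: bytes) -> None:
        self.disk[key] = data
        self.disk.move_to_end(key)
        if len(self.disk) > self.disk_capacity:
            # Drop the least recently used object; it can be refetched remotely.
            self.disk.popitem(last=False)
```

In this sketch a caller would construct TieredCache with a function that reads an object from remote storage and then call get(key) for every read; the stats counters show how many reads were served by each tier. Demoting evicted memory entries to disk rather than discarding them is one plausible design choice, made here so that a later read of the same object avoids another remote fetch.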