A Scalable Data Chunk Similarity based Compression Approach for Efficient Big Sensing Data Processing on Cloud
Date: May 2017
Skills: Java, Hadoop, MySQL.
We proposed a scalable data-compression approach based on similarity calculation among partitioned data chunks on the Cloud. A similarity model was developed to generate standard data chunks for compressing big sensing data sets. Instead of compressing individual basic data units, compression was performed over partitioned data chunks, so that a chunk sufficiently similar to a standard chunk can be stored as a reference to it. The algorithms were implemented with the MapReduce programming model to gain extra scalability on the Cloud.
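The core idea of chunk-level compression can be sketched in plain Java. This is a minimal, illustrative sketch only (the class, method names, similarity measure, and threshold below are assumptions, not the project's actual implementation): each incoming chunk is either matched to an existing standard chunk, in which case only a reference index is kept, or promoted to a new standard chunk.

```java
import java.util.*;

// Hypothetical sketch of similarity-based chunk compression.
// All names and the similarity formula are illustrative assumptions.
public class ChunkCompressor {

    // Similarity defined here as 1 / (1 + Euclidean distance)
    // between two equal-length numeric chunks (an assumed measure).
    static double similarity(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return 1.0 / (1.0 + Math.sqrt(sum));
    }

    // For each chunk, reuse the first standard chunk whose similarity
    // meets the threshold; otherwise register the chunk as a new standard.
    // Returns, per chunk, the index of its standard chunk.
    static int[] compress(List<double[]> chunks,
                          List<double[]> standards,
                          double threshold) {
        int[] refs = new int[chunks.size()];
        for (int c = 0; c < chunks.size(); c++) {
            int match = -1;
            for (int s = 0; s < standards.size(); s++) {
                if (similarity(chunks.get(c), standards.get(s)) >= threshold) {
                    match = s;
                    break;
                }
            }
            if (match < 0) {              // no similar standard chunk found
                standards.add(chunks.get(c));
                match = standards.size() - 1;
            }
            refs[c] = match;
        }
        return refs;
    }

    public static void main(String[] args) {
        List<double[]> chunks = Arrays.asList(
            new double[]{1.0, 2.0, 3.0},
            new double[]{1.0, 2.0, 3.1},   // near-duplicate of the first chunk
            new double[]{9.0, 9.0, 9.0});  // dissimilar chunk
        List<double[]> standards = new ArrayList<>();
        int[] refs = compress(chunks, standards, 0.9);
        System.out.println("standards=" + standards.size());
        System.out.println("refs=" + Arrays.toString(refs));
    }
}
```

In a MapReduce deployment, the map phase would compute chunk-to-standard similarities in parallel and the reduce phase would emit the reference indices, which is how the approach gains scalability on the Cloud.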