Semantic Caching Demo on SparkSQL and HDFS

Overview: A cache management system for Spark SQL (Apache Spark) and HDFS.

Requirements: Spark SQL, Apache Spark, HDFS, data caching algorithms, and English skills.

Motivation: We currently use HDFS to store and manage our data, and we run data analytics on it with Spark SQL in Apache Spark.

Normally, a Spark application, coordinated by its driver, loads the distributed data from HDFS into memory, processes it, and finally writes the results back to HDFS. Sometimes the results of previous queries can be reused, once or several times, by later queries. We therefore want to keep these results in memory long enough to be reused, using a caching mechanism called semantic caching.
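
To make the idea concrete, here is a minimal, Spark-independent sketch of the lookup logic behind semantic caching, under several assumptions. Queries are restricted to a single integer range predicate (lo <= value <= hi), and the `source` function is a stand-in for running the query against HDFS through Spark SQL. The names `SemanticCache`, `query`, and `sourceReads` are hypothetical and exist only for illustration. In a real Spark program the cached rows would more likely be a `Dataset` kept in memory with `cache()` or `persist()`, and the containment test would have to inspect the query's actual filter predicates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

/** Hypothetical sketch: a semantic cache for range queries lo <= value <= hi. */
final class SemanticCache {

    /** A cached result together with the range predicate that produced it. */
    private static final class Entry {
        final int lo;
        final int hi;
        final List<Integer> rows;

        Entry(int lo, int hi, List<Integer> rows) {
            this.lo = lo;
            this.hi = hi;
            this.rows = rows;
        }

        /** True if every row matching [qLo, qHi] is already in this entry. */
        boolean covers(int qLo, int qHi) {
            return lo <= qLo && qHi <= hi;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    // Assumption: stands in for executing the query on HDFS data via Spark SQL.
    private final BiFunction<Integer, Integer, List<Integer>> source;
    private int sourceReads = 0;

    SemanticCache(BiFunction<Integer, Integer, List<Integer>> source) {
        this.source = source;
    }

    /** Answers the query from a covering cached result if one exists. */
    List<Integer> query(int lo, int hi) {
        for (Entry e : entries) {
            if (e.covers(lo, hi)) {
                // Semantic hit: filter the cached superset instead of rereading.
                return e.rows.stream()
                        .filter(v -> v >= lo && v <= hi)
                        .collect(Collectors.toList());
            }
        }
        // Miss: run the query against the source and remember its result.
        sourceReads++;
        List<Integer> rows = new ArrayList<>(source.apply(lo, hi));
        entries.add(new Entry(lo, hi, rows));
        return rows;
    }

    /** Number of queries that had to go to the underlying source. */
    int sourceReads() {
        return sourceReads;
    }
}
```

With this sketch, calling `query(0, 20)` goes to the source once, and a later `query(5, 15)` is answered by filtering the cached result, so `sourceReads()` stays at 1. Unlike an exact-match cache keyed on the query text, the hit is decided by what the query means (its range is contained in a cached range). This sketch never evicts entries; a real system would also need an eviction policy to bound memory use.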

What we want: an implementation of a semantic caching program in Apache Spark. The program should be written in Scala or Java; Scala is preferred.
