The Evolution of Fault Tolerant Redis Cluster

This is a post by Hongche Liu and Jurgen Philippaerts from the Personalization and Ops Teams at BloomReach. At BloomReach, we use Redis, an open source advanced key-value cache and store, which is often referred to as a data structure server since values can contain strings, hashes, lists, sets, sorted sets, bitmaps and hyperloglogs. In one … 

 

Solr Compute Cloud – An Elastic Solr Infrastructure

This is a post by Nitin Sharma and Li Ding, engineers from the Search and Data Infrastructure Team at BloomReach. Scaling a multi-tenant search platform that has high availability while maintaining low latency is a hard problem to solve. It’s especially hard when the platform is running a heterogeneous workload on hundreds of millions of …

 

Mapreduce Fun: Sampling for Large Data Set

This post is by Chou-han Yang, principal engineer at BloomReach. The coolest thing about MapReduce is that we suddenly have enormous computing power and storage at our disposal. To me, it’s like being a kid who suddenly has a new toy and wants to incorporate it into his favorite games. What could be more fun than …

 

Strategies for Reducing Your Amazon EMR Costs

This post is by Prateek Gupta, a lead engineer at BloomReach. It is also cross-posted on the AWS Big Data Blog. BloomReach has built a personalized discovery platform with applications for organic search, site search, content marketing and merchandizing. BloomReach ingests data from a variety of sources such as merchant inventory feeds, sitefetch data from merchants’ websites …

 

Open Source at BloomReach

BloomReach benefits enormously from open source software throughout our data processing and serving systems. Our backend data processing and analytics systems use Hadoop, Cassandra and a myriad of libraries from the Apache and Python projects and other communities — and of course Linux. While the bulk of our code is tightly linked to our data …