Go to your favorite search engine, and type “Big Data Architecture.” You’ll see over 100M results. Click on a few links. You should begin to notice that none of the sites describe the same architecture, and they all contain different tools. You will also notice that almost all claim to be the best, the newest, and the wave of the future. But how can that be if they are all different?
Keep searching. The trend will continue. One site will tell you that you must use a document store. Another will say that NoSQL is the way to go. Another may say that relational models are back with a vengeance. You’ll read about some new unstructured data store, or you’ll read about some new distributed file system. You may even come across NewSQL, which claims to provide NoSQL scale and speed while supporting ACID transactions. They are all different, yet they all claim to be the best and the one recommended for you.
So how do you determine what is best and right for you? First, know your data, know your software, and know your goals. Then, make an educated decision and pick something to start with. Build a proof of concept (POC). Don’t sit and ponder for months or years, reading through 100M search links trying to find a solution that supports all of your needs. You won’t find it out there. Why? Because you haven’t built it yet. Who, me? Yes, you! The tools exist out there. You just have to put them together.
None of the articles, blogs, or documents you will come across were designed with your data, your hardware, your skillsets, or your systems in mind. Some will claim they can support all types, but nothing ever does. So take a chance! If you are a software developer or engineer, you already have the skills to build your own POC. Build it, and be proud!
Don’t be the person looking back after spending 6, 9, or 12 months reading the same material over and over, wishing you’d started something at the beginning. In hindsight, if you’d just chosen a solution 3 months into the process, you likely could have built and tested numerous systems already (all in the same amount of time you spent reading and being indecisive).
Fortunately, at Benefitfocus, we have decided to act. We are learning, researching, and — most of all — building and moving forward. We aren’t afraid to try new technologies and are certainly not afraid to fail. If something fails during the POC, we replace it or develop our own solution. Why? Because we chose to act rather than wait for someone else to develop a solution and miraculously have their link appear in our next Google search for Big Data Architecture.
How can we survive failure and constant change? It is simple. We learn from failure and have taken a componential design approach, which allows us to adapt. For the new Benefitfocus Data Cloud, we’ve adopted the Hadoop ecosystem, which comprises dozens, if not hundreds, of compatible tools and software for storage, reliability, scalability, monitoring, auditing, processing, and any other need you may have. We are going to build our system piece by piece until we have a system that we own and can evolve. And if a tool doesn’t already exist, we build it.
So if you need a document store, add it. If you need a relational warehouse, add it. If you need a distributed file store, add it. If you need batch reporting or real-time computation, create that layer. Make these tools work together; build to your specifications, evolve, and improve.
By adopting a componential approach, not only will your system be able to grow and evolve, but you will soon find that your engineers are turning into remarkable engineers. It forces you to learn what really makes a data system tick and why. You can’t just install Hadoop the way you would most commercial data management systems. You have to construct it piece by piece, and in doing so, you come to understand every last component and configuration, thus becoming an expert in data architecture.
Remember, your system is only as good as its weakest link. If your links are componential, you can replace and rebuild them, striving for that optimum set of tools.
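To make the replace-and-rebuild idea concrete, here is a minimal Python sketch of what componential design could look like in code. It is only an illustration under stated assumptions: the `Store` contract, the two toy stores, and the `DataPlatform` class are all hypothetical names invented for this example, and they do not represent the Benefitfocus Data Cloud or any Hadoop API.

```python
from __future__ import annotations

from typing import Dict, List, Optional, Protocol, Tuple


class Store(Protocol):
    """Hypothetical contract every storage component agrees to."""

    def put(self, key: str, value: dict) -> None: ...

    def get(self, key: str) -> Optional[dict]: ...


class InMemoryDocumentStore:
    """Toy stand-in for a document store: keeps one document per key."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def put(self, key: str, value: dict) -> None:
        self._docs[key] = dict(value)

    def get(self, key: str) -> Optional[dict]:
        return self._docs.get(key)


class AppendOnlyLogStore:
    """Toy stand-in for a log-style store: keeps every write, reads the latest."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, dict]] = []

    def put(self, key: str, value: dict) -> None:
        self._log.append((key, dict(value)))

    def get(self, key: str) -> Optional[dict]:
        # Walk the log backwards so the most recent write wins.
        for logged_key, value in reversed(self._log):
            if logged_key == key:
                return value
        return None


class DataPlatform:
    """Holds components by role so any single link can be swapped out."""

    def __init__(self) -> None:
        self._components: Dict[str, Store] = {}

    def add(self, role: str, component: Store) -> None:
        self._components[role] = component

    def replace(self, role: str, component: Store) -> None:
        # Only an existing role can be replaced; anything else is a mistake.
        if role not in self._components:
            raise KeyError(role)
        self._components[role] = component

    def component(self, role: str) -> Store:
        return self._components[role]


platform = DataPlatform()
platform.add("documents", InMemoryDocumentStore())
platform.component("documents").put("plan-1", {"name": "basic"})

# The weak link failed the POC? Swap it without touching the callers.
platform.replace("documents", AppendOnlyLogStore())
platform.component("documents").put("plan-1", {"name": "premium"})
```

Because callers only depend on the `put` and `get` contract, swapping one store for another does not ripple through the rest of the system. Note that in this sketch the replacement starts empty; migrating existing data is a separate step that a real system would need to handle.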
Big data is real and is growing at an exponential rate. Don’t spend all your time contemplating how to support it. Just do it. Take a chance and build, design, fail, and retry. Don’t let the big data opportunity pass you by.
By starting today and using componential design, you can build what is best today, then continuously adapt your technology to what is best in the future. The opportunity is real, so own it. Be proud of your design and, most of all, don’t be afraid to explore the unknown.