CIOReview
August 2017

... but often lag behind in their use of contemporary software development practices and tools. Many HPC programmers may be more ingrained in their disciplines than in the fast-moving world of technology startups, scrum masters, and social coding. Provide them with contemporary, collaborative software development platforms such as GitHub, GitLab, or Atlassian. Offer them classes in parallel frameworks, software development tools, and agile software development. Research software quality has different objectives than production software. With the exception of core software libraries, HPC codes may evolve each time they run and serve internal technical teams rather than outside customers. Nonetheless, developer productivity and the ability to effectively evolve software over time remain vital.

Today, enterprises are building out new types of high-performance computing focused on analytics of large data sets. Many data analytics workloads parallelize more easily than traditional HPC workloads, and in 2004 Google popularized the highly scalable and approachable map-reduce programming model for data analytics. As a result, popular platforms like Hadoop and Spark are now widely supported and increasingly common in enterprises. HPC systems can run analytics workloads, but data analytics platforms are not well suited to some other kinds of computation. And similar to HPC, power "users" for analytics are really developing software on top of these specialized software stacks.

Building an analytics team is yet another investment in specialized developers. Where HPC software teams need domain scientists who understand the phenomenology being simulated, analytics teams need experts in the data sources (be they measurements or customer transactions). Data have artifacts that have to be understood and propagated through the analytical process, or you will get answers that misinterpret the data. The mechanics of acquiring, transforming, and loading data into an analytical environment take up more time and energy than developing the first statistic to be computed on the data. A smart analyst or data scientist can be stymied by unapproachable data.

Machine Learning software has recently become approachable enough to be added to the analytics practitioner's toolbox. Machine Learning takes us beyond simple statistics to let the computer identify patterns, even high-dimensional patterns that you may not be able to see yourself. It is not a panacea and cannot extrapolate into unforeseen circumstances, but it can do things like match new data against previously labeled categories. The underlying algorithms are powerful, but the practitioner needs to understand the technical capabilities of those algorithms to use them effectively.

Moving forward, enterprises can look to cloud vendors to manage the hardware and supporting infrastructure of all but the highest-performance computing. This can eliminate capital expenditures, allow for experimental adoption, and be more cost-effective than keeping around a platform that isn't heavily used. Platform-as-a-Service providers can even maintain the complicated software stacks for you. But what you can't buy in the cloud is results that matter to your business. For that, you need people.

The one consistent prerequisite for ongoing investment is talent. Build a pipeline of creative people who like to ask the questions that haven't been asked before, are self-motivated to keep pace with rapidly changing technology, and understand your business enough to ask the right questions.

A program that can exploit multiple cores on a commodity system generally cannot exploit the scale of HPC without a complete rewrite.
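For readers who have not seen the map-reduce programming model mentioned above, the following is a minimal single-process sketch in Python of the idea: a map step that emits key/value pairs, a grouping step, and a reduce step that combines the values for each key. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for illustration only. This is not Google's, Hadoop's, or Spark's implementation, and it does no actual distribution across machines.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit a (word, 1) pair for every word in a single document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Shuffle: group emitted values by key. A real framework would do this
    # across machines; here it is an in-memory dictionary (an assumption).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    # Run the three steps in sequence over a list of document strings.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big compute"]))
```

Because each map call looks at only one document and each reduce call looks at only one key, the calls do not depend on one another, which is what lets a framework spread them over many machines.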