CIOReview | Feb - 2018

Controls can be put in place on the sandboxes, such as a 90-day expiration or charge-back methods, to help establish outer boundaries for compute, storage, and network traffic.

A marketing initiative purchases data to analyze price sensitivity or market basket offerings within a specific segment. Once the team finds a profitable segment, the EUREKA moment, the marketing process begins across production channels: analytics on production customer data, blended with purchased market data, send leads to production channels.

Provisioning

Controlling access to data tends to occur within technology, sometimes as part of a project, following information security guidelines. As a result, access control tends to be designed from an application or source perspective. This model falls down in several key areas.

- Technology will design access within a project and may not have a deep enough understanding of business utilization to appropriately assess the inherent risk. This drives toward tighter controls than the level of risk warrants.
- Justification processes are performed on an individual basis, which leads to inconsistent entitlements.
- Multitude of sources. Data lakes need information from many sources. If each source has a different provisioning process, provisioning in the lake becomes complicated.

One concept for addressing this is to ensure the appropriate first line of defense is defined. For most companies, the first line of defense is the business or department directly, not IT. This reduces, but doesn't eliminate, the role that technology plays and can enable a more balanced approach to risk and control. It also elevates the accountability of the business in defining the right level of risk.

A second opportunity is to change the design from where data is sourced to where it is consumed. This is commonly known as a ROLE-based provisioning process.
Aligning access control with a specific department more closely ties the data needs to relevant policies and justifications. Role-based design also eliminates the complexities of provisioning data from multiple sources.

Practical Application:

By having a role called FINANCE, all GL, transaction, and account-related information can be provisioned. Most roles in finance would have a consistent business need to know. Relevant policies, e.g., Sarbanes-Oxley, would be applied to anyone with this role. On the other hand, a second role called MARKETING could require masked customer data, with relevant privacy or GLBA policies implemented with this role.

One Size Does Not Fit All

Each usage of data has a unique context. Performance, quality, ease of use, cost, and speed to decision can all vary. Regulatory reporting may require a GL-reconciled data mart with full lineage from when data is acquired to when it is put on a report. A question such as "Hurricane Katrina is coming; do we have any customers, inventory, or facilities that will be impacted?" has a different level of urgency. There, the data quality can be the best at hand.

Segmenting your data lake to support different analytic needs will be helpful in driving success of the consumer. Many analysts are used to wrangling data from disparate sources. Some information consumers need a more structured organization of the data assets.

Plan for a variety of tiers, perhaps starting with three: RAW, FULLY ORGANIZED, and one in between. Be prepared to allow for using data across tiers: sandbox, raw, fully organized, and so on.

Shaun Rankin
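
The FINANCE and MARKETING roles from the Practical Application can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular product: the `Role` class, the `provision` function, the dataset names, and the masked field `name` are all hypothetical. The point it shows is that entitlements, policies, and masking hang off the consuming role rather than off each source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A consuming role: what it may read, under which policies (hypothetical model)."""
    name: str
    datasets: frozenset            # datasets the role is entitled to
    policies: tuple                # policies applied to anyone holding the role
    masked_fields: frozenset = frozenset()  # fields hidden from this role


# Assumed role definitions, mirroring the article's two examples.
ROLES = {
    "FINANCE": Role(
        name="FINANCE",
        datasets=frozenset({"gl", "transactions", "accounts"}),
        policies=("Sarbanes-Oxley",),
    ),
    "MARKETING": Role(
        name="MARKETING",
        datasets=frozenset({"customers"}),
        policies=("Privacy", "GLBA"),
        masked_fields=frozenset({"name"}),  # hypothetical sensitive field
    ),
}


def provision(role_name: str, dataset: str, record: dict) -> dict:
    """Return the record as the given role is allowed to see it."""
    role = ROLES[role_name]
    if dataset not in role.datasets:
        raise PermissionError(f"{role.name} is not entitled to {dataset}")
    # Mask any field the role must not see in the clear.
    return {
        key: ("***" if key in role.masked_fields else value)
        for key, value in record.items()
    }
```

Adding a new source to the lake then means adding a dataset name to the roles that consume it, rather than running a separate provisioning process per source.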