To make sense of big data, look inward

Having difficulty understanding big data? Don’t examine it on the Internet. Instead, look at your own company’s version of it. From e-mail systems to invoicing to accounts payable to text messaging to document sharing to HR, the list of disparate systems on which a business relies can easily number in the hundreds. The sum total of all this data is already big, and it changes and gets bigger every day.

This is more than a useful analogy. Many organizations apply lessons from big-data analysis to draw meaning from the confluence of data in two or more of their own internal systems, much as big-data practitioners do with information on the Internet. The big-data wins an organization can score range from better business development and improved billing structures to e-discovery readiness, protection from fraud and beyond.

“Banks and retail have figured this out,” says Mark Hayes, managing director at Heydary Hayes PC, “but smaller companies also have valuable data. [However,] it’s not co-ordinated, it’s not gathered in the same place, it’s not linked together. Companies are probably leaving a lot of value on the table.”

Extracting that value involves merging big-data thinking with data governance. The tips in this article will help you realize that value in your organization.

A successful data governance initiative syncs with the organizational culture in which it lives. That culture includes procedures, policies, current technologies, and stakeholder perspectives that can span litigation, intellectual property management, end-user concerns and other factors.

“You’re trying to change the way people use data,” says Hayes. “If you try to force it, expect resistance.”

He recommends designating a senior person to take charge of data governance initiatives, handle information co-ordination across the organization and deal with divergent interests.

Consider what goals you want to accomplish. “We’re still seeing so many organizations in Canada that are where they were five years ago in terms of e-discovery readiness,” says Susan Wortzman, founder and partner at Wortzman Nickle Professional Corporation.

“They’re still responding on a transactional basis. They haven’t got the informational governance in place. The cost is more than it should be,” she says.

One solution is to perform an information audit. “Companies need to systematically catalogue what they have, what they gather and what they could gather,” Hayes says.
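As a rough illustration of what such a catalogue might look like in practice, the sketch below groups data sources under the three headings Hayes names. The class names, fields and sample entries are hypothetical, chosen only for illustration; a real audit would record far more (formats, retention rules, privacy constraints).

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    HELD = "already held"
    GATHERED = "actively gathered"
    POTENTIAL = "could be gathered"


@dataclass(frozen=True)
class DataSource:
    name: str      # what the data is
    system: str    # where it lives
    owner: str     # who is responsible for it
    status: Status


def catalogue(sources):
    """Group data source names by their audit status."""
    grouped = defaultdict(list)
    for source in sources:
        grouped[source.status].append(source.name)
    return dict(grouped)


# Hypothetical entries, for illustration only.
SOURCES = [
    DataSource("invoices", "accounts payable", "finance", Status.HELD),
    DataSource("client e-mail", "e-mail server", "IT", Status.GATHERED),
    DataSource("social media mentions", "external sites", "marketing", Status.POTENTIAL),
]

audit = catalogue(SOURCES)
for status, names in audit.items():
    print(f"{status.value}: {', '.join(names)}")
```

Recording an owner for each source also gives the senior person in charge of data governance a named contact for every system.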

To better grasp the importance of this exercise, Martin Felsky, e-discovery counsel at Borden Ladner Gervais, proposes a hypothetical question about the data: “If you sold it on the open market, who would buy it and what would it be worth?”

Whoever maps all the data in an organization will likely find some of it embedded in silos — myriad systems of varying vintages (some of which could date back to the 1970s) storing data in many different formats — as well as housed in web-based systems like salesforce.com and social media sites like Facebook, LinkedIn, Twitter and Pinterest.

While many social networks provide application-programming interfaces (APIs) that allow people to capture site information, some companies instead enact policies that keep the company from knowingly profiting from social media accounts, so that they avoid a duty to capture that data.

Once all relevant data sets have been identified and understood, choose the right big data tools for the job. Since few organizations boast the requisite tools or processing power, Felsky suggests using the cloud. “You can rent as much space as you need, and you can rent the tools you need for any given project,” he says.

A privacy audit keeps organizations out of hot water. “You can always do more than you [think] you’re permitted to do,” Hayes notes, adding that exceptions in privacy legislation and changes to internal policies may increase the latitude a company currently has.

Merging data on to one platform makes mining it much easier. Perhaps more importantly, today’s silos don’t help fulfil real-world work requirements.

“Here’s the type of search companies do on their systems,” says Chris Grossman, senior vice-president of enterprise applications for Rand Worldwide, rattling off a list of criteria: “All communications that occurred about a particular topic between particular organizations, regardless of where they exist: Salesforce (a cloud computing company), e-mail, Information Management, voice-over-IP; if documents were exchanged…”

He adds a further complication: in a multinational organization that works in several languages, keyword search takes a back seat to meaning-based or topic-based search.
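The kind of query Grossman lists can be sketched as a filter over records that have already been merged on to one platform. This is a minimal illustration, not a description of any vendor's product: the `Record` fields, the topic labels (assumed to be assigned when each record is ingested, rather than matched by keyword) and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Record:
    source: str               # originating system, e.g. "e-mail"
    topic: str                # topic label assigned at ingestion (assumed)
    parties: FrozenSet[str]   # organizations involved in the communication
    has_documents: bool       # whether documents were exchanged


def find_communications(records: List[Record], topic: str, org_a: str,
                        org_b: str, documents_only: bool = False) -> List[Record]:
    """Return records on a topic between two organizations, from any source."""
    wanted = {org_a, org_b}
    return [
        r for r in records
        if r.topic == topic
        and wanted <= r.parties
        and (r.has_documents or not documents_only)
    ]


# Hypothetical merged records, for illustration only.
RECORDS = [
    Record("e-mail", "contract renewal", frozenset({"Acme", "Widget Co."}), True),
    Record("voice-over-IP", "contract renewal", frozenset({"Acme", "Widget Co."}), False),
    Record("Salesforce", "contract renewal", frozenset({"Acme", "Other Ltd."}), True),
    Record("e-mail", "holiday party", frozenset({"Acme", "Widget Co."}), False),
]

matches = find_communications(RECORDS, "contract renewal", "Acme", "Widget Co.")
with_docs = find_communications(RECORDS, "contract renewal", "Acme", "Widget Co.",
                                documents_only=True)
```

Because the filter works on topic labels instead of raw keywords, the same query could in principle cover records written in different languages, provided the labelling step handles them.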

Dominic Jaar, KPMG Canada’s national practice leader, information services, admits that security is one of the arguments against aggregated solutions. His rebuttal: “You can spend money on average security for a bunch of different systems, or the same money on top-notch security for the one system.”

He adds that it’s easier to control access to one specific data repository than to several.

An easier alternative to merging data from various systems may be a federated search, in which one tool searches through many systems. However, federated-search tools can struggle when faced with in-firm systems plus online sources, and the attendant variety of database structures. “For most law firms, it doesn’t work,” Jaar says.
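In outline, a federated search sends one query to each system and gathers the results without moving the data. The sketch below assumes every system can be wrapped behind the same simple search function, which is exactly the assumption that tends to break down when database structures vary; the backend names and records are hypothetical.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[str]]


def keyword_backend(records: List[str]) -> SearchFn:
    """Wrap a list of text records in a case-insensitive keyword search."""
    def search(query: str) -> List[str]:
        needle = query.lower()
        return [record for record in records if needle in record.lower()]
    return search


def federated_search(query: str, backends: Dict[str, SearchFn]) -> Dict[str, List[str]]:
    """Send one query to every backend and collect the hits per system."""
    results: Dict[str, List[str]] = {}
    for name, search in backends.items():
        try:
            results[name] = search(query)
        except Exception:
            # A failing backend is reported as empty rather than aborting the search.
            results[name] = []
    return results


# Hypothetical stand-ins for separate internal systems.
BACKENDS = {
    "e-mail": keyword_backend(["Re: Acme contract renewal", "Lunch on Friday?"]),
    "documents": keyword_backend(["Acme master services agreement.docx"]),
    "billing": keyword_backend(["Invoice 1001: Widget Co."]),
}

hits = federated_search("acme", BACKENDS)
```

Each real system would need its own adapter to fit this common interface, and the effort of writing and maintaining those adapters is one reason the merged approach can look more attractive.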

Once the data is set up to be more easily scanned, you can mine it for insights, including relationships that might not have been perceptible before.

Companies that operate at a higher level of data governance maturity can run real-time analytics. “Every day, you can check a dashboard with all the information you capture,” Jaar suggests.

This article was originally published by Lawyers Weekly Magazine.
