Demystifying Metadata Management

or “Cool, a 20G metadata repository….”
By Tom Jesionowski, Prime Data Consulting
© 2008

Try to pitch the idea of a metadata management solution to the average business manager and just watch their eyes glaze over. Most either don’t care or just don’t get why it is something they should spend money on. Add the overselling of metadata as the holy grail of data management, and getting buy-in becomes even harder.

The primary problem with the pitch and sales angle is that you normally have the technical team at the center of it. Let’s face it: if the technical team were really good at sales, they would be doing technical sales. To get the funding needed from your business management team, the first task is building common understanding. The concept of metadata management needs to be something that resonates with them.

One question to ask is, “Do they know what the world’s most popular metadata repository application is?” You’ll get some glazed looks and stares, but most likely no answer. The next question is, “How many of you use MP3 players?”

Yes, MP3 players are by far the most popular use of metadata. The music players can sort your music files by artist, album, song, genre or custom playlist. They provide easy, efficient access to the data through metadata management. In a very short time, MP3 players eliminated the need to sort and re-sort music collections the way John Cusack did in the movie High Fidelity.

Take that understanding and turn it towards your business. What could you do if you knew all the places where you have the field CUSTOMER_NAME, or variations such as CUST_NM? What could you accomplish with the time your developers and DBAs spend defining this in their data models for the umpteenth time?
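For the technically inclined, here is a minimal sketch of what "knowing all the places" could look like in practice. It is an illustration only: the catalog entries, the abbreviation map, and the function names are all made up for this example, and a real repository would draw them from your own naming standards and data dictionaries.

```python
import re

# Hypothetical abbreviation map; a real one would come from your naming standards.
ABBREVIATIONS = {"CUST": "CUSTOMER", "NM": "NAME"}


def canonical_name(column):
    """Upper-case a column name and expand known abbreviations."""
    tokens = re.split(r"[_\s]+", column.strip().upper())
    return "_".join(ABBREVIATIONS.get(token, token) for token in tokens if token)


def find_field(catalog, field):
    """Return every (table, column) pair whose canonical name matches `field`."""
    target = canonical_name(field)
    return [
        (table, column)
        for table, column in catalog
        if canonical_name(column) == target
    ]


# Made-up catalog of (table, column) pairs, purely for illustration.
catalog = [
    ("ORDERS", "CUSTOMER_NAME"),
    ("BILLING", "CUST_NM"),
    ("CRM_CONTACT", "cust_name"),
    ("ORDERS", "ORDER_DATE"),
]

matches = find_field(catalog, "CUSTOMER_NAME")
for table, column in matches:
    print(f"{table}.{column}")
```

The lookup is only as good as the abbreviation map behind it, so the map itself is metadata that someone has to capture and maintain.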

But the metadata repository alone did not create the tipping point for MP3 players, and a repository alone won’t create one for enterprise metadata either. The data entry component plays a critical role. For example, what contributed to the success of the MP3 player was the Compact Disc Database (CDDB) coupled with user submissions. The metadata capture process was distributed and structured for consistency.

This is an important point. A critical success factor for metadata management, and the way to avoid collecting metacrap, is process controls, or Data Governance practices. The process must make metadata easy to enter and easy to retrieve for reuse.

Remember that first MP3 player you got for your birthday? After getting over how stoked you were, you spent the next couple of days loading your CD collection. Can you imagine what an impossible task that would have been without the CDDB to house and redistribute all that metadata?

Now shift the vision to the last business application your company integrated. The project team had to figure out how to bring over and integrate all that data. How many hours of precious development time had to be assigned to the task? How many fewer would it have been with a decent metadata application and process? There is your ROI for a repository acquisition and metadata process development.

Introducing Tom Jesionowski

By Gwen Thomas

I’m so excited! My good friend, Tom Jesionowski, will be posting here, and I’m sure you’ll enjoy his unique perspectives as much as I do.

Make sure to read his first post here, “Demystifying Metadata Management.” You’ll never look at your MP3 player the same way again.

Tragedy of the Commons

By Gwen Thomas

Somebody wrote to me saying they could no longer find an article I’d written about the relationship between the phenomenon of the “Tragedy of the Commons” and working with data. Could I please, they asked, put it back up on the site?

They couldn’t find it because the article wasn’t originally published here. It was done for z/Journal (www.zJournal.com), where I’ve been working for several years with their amazing editor, Amy Novotny.

Anyway, Amy’s ok with reprinting information, so here’s the piece. And be sure to check out their publication.

Tragedy of the Commons

Nobody likes to be told “no.”

Most of us aren’t fond of saying it, either. But if you work with mainframe data, chances are you’re frequently put in the position of having to say No. Sometimes circumstances make this easy: the request would take your organization out of compliance with a contract, a law, or a regulation.

But then there are the other types of circumstances. The ones where compliance is required, but HOW you comply is up to interpretation. The ones where your organization has not created a clear set of rules to be enforced. The ones where they look to professionals like you to interpret the situation and then make a call.

Sometimes it’s easier to say No in such a situation if you can get the person who’s made a request to support the underlying reason behind your refusal. Today we’re going to talk about something everyone can support: avoiding tragedies.

The Tragedy of the Commons

This concept is often used to describe ecological situations, where the “commons” referred to are grazing lands open to all. However, the concept can also be applied to working with data, especially common resources (such as sets of mainframe data) that are shared by many individuals or groups but are critical to the success of the entire organization.

Have you ever been to the city of Boston, in the U.S.A.? Early in the country’s history, Boston Common was covered with sheep owned by individuals, but nourished by grass grown on common lands. Today, this is a beautiful, lush public park; the story of the sheep is often treated as a “cute” history lesson.

A desert in sub-Saharan Africa, however, holds a different type of history lesson – one that has resulted in tragedy for thousands and a lesson for us all.

Just forty years ago, the Sahel region in sub-Saharan Africa was a fertile pastureland. It supported over a hundred thousand herdsmen and over a half million head of zebu (their grazing cattle). The area seemed to be thriving; from the 1920s there had been steady population growth of both people and cattle. Between 1955 and 1965 the area had received both unusually heavy rains and the advantages that came from the development of deep wells.

Herdsmen were optimistic. Herd sizes increased. Eventually there were more cattle than the commons could support, so in the 1960s the area experienced overgrazing. This led to loss of vegetation, which in turn led to soil erosion. By the early 1970s, lush land had become desert. Between 50 and 80% of the livestock was dead, and much of the population was destitute.

Looking back, the reason was clear: individuals took actions with common resources that – when considered individually – seemed reasonable. But collectively, they added up to more than the system could support, and the system collapsed. Ecologist Garrett Hardin wrote about the Sahel and the phenomenon he called “The Tragedy of the Commons.”

Acting Selfishly

It’s human nature for individuals or organizational silos to be selfish – especially if they can gain by acting selfishly in the short-term. It’s also natural for the cumulative effect of selfish actions to go unnoticed. But if no one is watching over the health of the commons, a tipping point could be reached, where irretrievable harm to the greater system is done.

Data Commons

Peter Senge, in his book The Fifth Discipline, spoke of other situations where we must apply systems thinking to protect our own commons. It’s an excellent book; one everyone should read. But you don’t have to read it, I’ll bet, to cite examples of how you’re expected to protect your organization’s data commons. Are you expected to assist with Data Quality? Availability? Compliance with regulatory requirements? Many disciplines – and many corporate programs – have sprung up to protect data commons.

It’s worth noting, though, what most of these programs haven’t learned from ecological tragedies.

According to Senge, you avoid The Tragedy of the Commons with one of two strategies: through centralized management, or through voluntary self-restraint. Either can work – but ONLY if it has the right support structure and cultural values.

For the first strategy to work, the manager of the commons has to have power to say No. Period. End of discussion. Your sheep ain’t coming in, so go away.

For the “voluntary self-restraint” strategy to work, you have to know what puts your commons at risk, the chances of those events occurring, and where your tipping point may lie. In other words, you need to have a Risk Assessment. You also need clear metrics, and a method of letting everyone know that the commons is being stressed. You need clear guidelines for what to do when it’s stressed, and accepted roles and accountabilities for when the organization has moved into “save the commons” mode. You need individuals who are willing to place the health of the whole over their immediate wants and needs, and you need a system of governance for resolving disputes.

The underlying goal of many compliance efforts is to protect the integrity of data commons. No doubt you’ve adopted the centralized management strategy for some compliance issues, such as Sarbanes-Oxley or Access Management. If your organization is trying to avoid this approach for other applications, it needs to remember the support that must be in place for voluntary compliance and for you, if it expects you to say No so you can avoid your own version of The Tragedy of the Commons.
