After reading the book Fearless Change: Patterns for Introducing New Ideas, it became clear that introducing new ideas and concepts, getting them across, and then implementing them is a full-time job that should not be taken on by one person. It also should not be taken lightly. Needless to say, I wish I had read the book before doing so, as it might have saved me a lot of time. However, it seems the biggest hurdle is out of the way and we are finally getting down to brass tacks. This presents a new problem: once the idea is introduced and you start working towards the goal, you need someone else who understands the concepts you are talking about and can explain them in layman's terms. They need to know the limitations between disparate systems and so on. If you are by yourself, it gets difficult to actually do the programming, be the mouthpiece, explain your ideas, mock everything up and bring it to fruition without being stopped, especially when dealing with an idea as abstract as content. It also doesn't help when people take their own idea of content and then suggest how it should be managed. At the very least it's a discussion about MVC and a simple separation of presentation and functionality, and it's hard getting those concepts across to people who see a webpage and think of it as an actual page instead of different pieces of data presented on one page. This is all subjective though, because others don't see the content the way a content manager or system administrator would. So it's an education process, one I really struggle to find patience for, but I'm slowly getting better at it. It helps having someone who generally allows me to solve problems, and I realize it'll probably be the same wherever I go; it's extremely difficult to swallow sometimes though.
It seems a large part of this industry is so filled with bad habits that when you start talking about a smarter way to do things, people can't stop thinking the one way they are used to. In this case, that means not seeing the difference between static and dynamic publishing, or not understanding that data spread across two different systems is notoriously difficult to get working properly. The WordPresses and the MovableTypes (which can be dynamic, even though there is no object model available) have made it very difficult for fast-paced industries or constantly changing publishing houses to get their content out in proper fashion. At the heart of it is the fact that nothing is really static. Let's face it, this website may stay the same for weeks on end, but the websites of AOL, Conde Nast (magazine websites), Nymedia, Digg, Universal, Apple and so on are constantly changing. Every day the content changes; the template or design of the site may never change, but content is being added and deleted. Apple is adding discounts or new computers, Digg is adding stories and comments, and Nymedia is adding all sorts of stuff: images, slideshows, articles, video. This is compounded by the fact that the data at most of these places is simply not flat. It's not just a blog post or entry. It's a list of models associated with a list of places or topics, displayed on a webpage. So we are dealing with objects, and for that we need an object database.
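To make "not flat" a bit more concrete, here is a minimal sketch in Python of content held as objects that reference each other, with the page as nothing more than a view over them. The class names (Place, Topic, Model, Page) and the sample data are invented for illustration; this is not how any particular CMS or object database does it, just the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Place:
    name: str


@dataclass
class Topic:
    name: str


@dataclass
class Model:
    # A model is associated with lists of other content objects.
    name: str
    places: List[Place] = field(default_factory=list)
    topics: List[Topic] = field(default_factory=list)


@dataclass
class Page:
    """A page is a view over content objects, not the content itself."""
    title: str
    models: List[Model] = field(default_factory=list)

    def render(self) -> str:
        # The "template": fixed layout, whatever the content happens to be.
        lines = [self.title]
        for model in self.models:
            places = ", ".join(p.name for p in model.places)
            topics = ", ".join(t.name for t in model.topics)
            lines.append(f"{model.name} | places: {places} | topics: {topics}")
        return "\n".join(lines)


# Hypothetical sample content.
paris = Place("Paris")
fashion = Topic("Fashion Week")
jane = Model("Jane Doe", places=[paris], topics=[fashion])
page = Page("Seen this week", models=[jane])

# The content changes; the template (render) does not.
jane.places.append(Place("Milan"))
print(page.render())
```

The point of the sketch is that the page never stores its own copy of the text. When a place is added to a model, every page that shows that model picks up the change the next time it is rendered.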
From the Wikipedia article on object databases, some important paragraphs:
"Object databases based on persistent programming acquired a niche in application areas such as engineering and spatial databases, telecommunications, and scientific areas such as high energy physics and molecular biology. They have made little impact on mainstream commercial data processing, though there is some usage in specialized areas of financial services[citation needed]. It is also worth noting that object databases held the record for the World's largest database (being first to hold over 1000 Terabytes at Stanford Linear Accelerator Center 'Lessons Learned From Managing A Petabyte') and the highest ingest rate ever recorded for a commercial database at over one Terabyte per hour."
Why the publishing and media industry has been so slow to adopt object databases, when there is a clear need in terms of spatiality and data size, is beyond the scope of this entry. However, I believe part of it is that the industry as a whole has been pushing relational databases for so long that nobody asks, "Do we actually want to do this in an RDBMS, or should we look at our data model and, if it's more object oriented, consider an ODB?" Instead they simply say Oracle, SQL Server, Postgres or MySQL. Those are good for relationships that are planned out ahead of time and then normalized, but they become a nightmare to manage when you start dealing with object data, relationships, normalization and all that comes with it, including a constantly changing data model, which isn't a one-off but the norm. Data isn't static and content isn't static; it simply doesn't make any sense to model it as if it were. Stop and ask yourself whether your content management software takes all of these things into account, whether it can scale, and how difficult it is to manage. Is your data model ever going to change? Do you have a DBA on staff? Are your programmers normalizing the data and tables they use? Back to the Wikipedia article:
"The efficiency of such a database is also greatly improved in areas where you often need to find out lots of data about one thing. For example a banking institution could get the user's account information and provide them with many things in a high efficiency such as transactions, account information entries, etc. The Big O Notation for such a database paradigm drops from O(n) to O(1) greatly increasing efficiency in these specific cases."
How often have you gone to your bank's website and needed a different username for a different system? How many times have you called an institution only to be greeted with "Sorry, Madam or Sir, we can't complete this request because some system is down" or "we don't have access to that information; we'll have to transfer you"? How many times have you been at work when a project manager requested all the information on users who liked models with blonde hair in the last 6 months? "Yeah, I'll write a script to build the queries to generate that report and have it to you in, say, a couple of hours," or "that other information is in a table/db that will most likely kill performance if I join it," or "that data is in two different systems, so my query will have to pull the data, put it into some array and build it from there." That, versus a query on an object database, is the difference between night and day, or between O(n) and O(1). Most of those excuses exist because the data is simply not related properly, and relating it would involve a lot of work. If an object database were used, those sorts of excuses, if given at all, would be about security. That's not to say it would be faster, even though it most likely would be because the relational data is most likely not normalized, but the difference is clear as day.
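For what the O(n) versus O(1) claim is getting at, here is a toy sketch in Python using the banking example from the quote. The rows, the Account class and the function names are all made up for illustration. A real relational database would normally use an index rather than scan every row, so read this as the worst case the quote describes, not as a benchmark of any actual product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Flat, table-like storage: (account_id, amount) rows with no index.
rows: List[Tuple[int, float]] = [(1, 25.0), (2, 10.0), (1, -5.0), (3, 99.0)]


def transactions_by_scan(account_id: int) -> List[float]:
    """Touches every row to find the matches: O(n) in the total row count."""
    return [amount for acct, amount in rows if acct == account_id]


@dataclass
class Account:
    account_id: int
    # The account object holds its own transactions directly.
    transactions: List[float] = field(default_factory=list)


# Object-style storage, built once from the same data.
accounts: Dict[int, Account] = {}
for acct, amount in rows:
    if acct not in accounts:
        accounts[acct] = Account(acct)
    accounts[acct].transactions.append(amount)


def transactions_by_reference(account: Account) -> List[float]:
    """Follows one reference: O(1) to reach the whole collection."""
    return account.transactions


print(transactions_by_scan(1))
print(transactions_by_reference(accounts[1]))
```

Both calls return the same transactions. The difference is that the scan gets slower as the table grows for everyone, while following the reference costs the same no matter how many other accounts exist.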
Here's a nice animation linked to from the Wikipedia article.
This is not to say that there aren't problems with object databases. Some critics have stated that they are slower because of their strict structures and pointer operations, which leaves one bemused. Valid argument for 1980; not a valid argument for 2008. Also cited frequently is the lack of query tools, which actually exist today for all of the major ODBs.
Another item of note is office politics. Fearless Change states that skeptics can make things very difficult, especially when they think they already understand the problem. I've seen this behavior myself; it's obvious they have no clue, and I don't mean that in a bad way. It's just obvious when someone is knocking your idea but simply hasn't thought it through, especially in my case, when you start talking about corner cases, scale and a totally different model of static and dynamic systems. This is something you need to be aware of and counter with facts. At each turn, hit them with a fact or scenario where their comments fail; not to be an asshole, but hopefully to get them actually thinking about the problem. Maybe they don't see the big picture, so it's your job to at least reach out and get these people on board.
All that said, sometimes this industry really sucks when you don't work with like-minded people; it's always an uphill battle when you think differently. Sometimes I wish I could just strike out on my own but, of course, I don't own a media company at present.
On a completely unrelated note, this site, Cityfile, is very well done. I suspect they are using RoR or Plone, or at the very least had their data model spec'd out ahead of time if they are using an RDBMS and Java or something like that.
Christopher Warner is part genius, part idiot. This makes him well balanced. He's worked on numerous open source projects with great people and has generally led an eventful and fulfilling life. He hopes to retire an old man in a rocking chair should he be so fortunate.
Content management, office politics and why there is no such thing as static content