Compare CMS - Part 2a - More Features






Page 2a:

Text or compiled code
Web Standards
SEO and CMS
The first checks to make
Bad CMS faults
Why search engines like CMS
Servers for CMS
Alternative server applications
Servers for testing a CMS on a LAN
Server security for CMS
CMS server round-up




Text or compiled code

All the solutions listed here use text-based code, as opposed to compiled code. To explain: code basically comes in two flavours - text-based code, which you can read easily as it is written in plain text and interpreted by the server when a page is requested; and compiled code, which is 'invisible', as you need a decompiler to view it.

The advantages of text-based code in this situation are huge, and there seem to be few disadvantages. Speed is one possible disadvantage, but this mainly applies to high-CPU-load applications on a local machine, not to server-side web applications. Admittedly, a fast and extremely compact compiled language like Delphi will give clear advantages in a client-side high-load application; but for online use, a page can be served in a fraction of a second and any faster would be pointless. The server cache acts as a reservoir anyway. Mostly, developers seem to prefer PHP as a codebase. Perl is sometimes used, but some developers have found it too slow for large apps: one says rewriting his app from Perl to PHP increased speed by a factor of 10. Other devs do not agree on this point, though.

Security is a sensible question, and it's true to say that compiled code solutions are theoretically more secure. Only in ecommerce apps, though, does anyone seem to take the compiled route in order to exploit this advantage. Large commercial CMS can use a fairly exotic language, like Tcl, but this does not apply in the OSS and small commercial CMS field, where compatibility requirements and distributed workforces may apply.

One big OSS CMS listed here is written in Java and serves .jsp pages. JSP (JavaServer Pages) is a server-side Java technology, not JavaScript, and a .jsp page is not in itself a bad idea in any way (that would be unlikely). It's just that such pages are much more likely than most to be loaded up with client-side JavaScript - which most certainly is a bad idea. That's a shame since this CMS is one of the best, but it may be a poor route today if you want top search results and are concerned to provide good accessibility levels and a safe browsing environment for your visitors. Of course, it depends on the implementation; if the scripting is kept light, and what script there is can be kept in external files that are only called as required, then things may be different and problems will be reduced.

Plone, you will note, is written in Python, though pages are output in HTML. Views seem to be mixed on this - many say Python is one of the most powerful and fastest text-based languages; others don't agree. Plone is certainly the most powerful free CMS, though. Some larger users seem to prefer specific features such as DC compliance - Dublin Core metadata - that Plone can now probably supply. This particular feature is popular with universities and so on, for indexing purposes, but has yet to make the leap into the mainstream. Many CMS have a major problem with accessibility in any case, and they would need to fix this first before adding minority features. Plone handles hundreds of thousands of pages, and has a higher accessibility rating than most other CMS, so it has several major plus points. It is one of those applications that need a dedicated server, though, as will be explained later.


Web Standards

Web standards are the specifications that code is written to, and the accepted levels of support for requirements such as differing equipment or users. It seems obvious that code should be standard, and applications should work in certain accepted ways; but this is not the case. Complex web applications are not necessarily written in good, correct code; and they may be implemented incorrectly at server level.

The end result, therefore, may on the one hand be good: a fully-conformant webapp that has been conceived and constructed by people who knew about standards, who wanted to provide maximum compatibility and accessibility, and who wished to make their application future-proof, as far as is possible in this fast-moving environment. Or, on the other hand, the application might have simply been written by developers who just went ahead and did it, regardless of the real-world consequences. There are a great many webapps like that and in the past they were the majority.

You will find examples of both, and until you are more familiar with the products, one of the easiest sets of checks to make is whether or not the CMS conforms to web standards and real-world requirements. The W3C and the search engines have given you the required parameters; research them and apply them, and choose your CMS accordingly.

Another aspect that is increasingly important, but harder to measure for most people, is the accessibility and acceptability rating of the page codebase and coding. In the past, web pages were laid out using cells and tables. This approach became obsolete around 2002, and all modern web pages (whether static or dynamic, it makes no difference) should be based on layers (divs) and CSS. In addition, there is a choice of many different codebases to work with, in order to provide the required functionality.
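A rough way to see which layout approach a page uses is to count the relevant elements in its source. The following is a minimal sketch in Python (standard library only); the names `LayoutCounter` and `count_layout` are made up for illustration. It is a heuristic, not a verdict: a table holding real tabular data is perfectly legitimate, so a non-zero count is only a prompt to look more closely, with nested tables being the stronger hint of an old-style layout.

```python
from html.parser import HTMLParser


class LayoutCounter(HTMLParser):
    """Count <table> and <div> start tags, and tables nested inside tables."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.nested_tables = 0
        self.divs = 0
        self._depth = 0  # current <table> nesting depth

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
            if self._depth > 0:
                self.nested_tables += 1
            self._depth += 1
        elif tag == "div":
            self.divs += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._depth > 0:
            self._depth -= 1


def count_layout(html):
    """Return element counts for one page of HTML source."""
    counter = LayoutCounter()
    counter.feed(html)
    return {
        "tables": counter.tables,
        "nested_tables": counter.nested_tables,
        "divs": counter.divs,
    }
```

Feeding it the source of a portfolio page gives a quick first impression before a fuller validation is run.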

In the past, almost any type of code was acceptable; this is no longer the case. Scripted functionality has to be a compromise, now, between what it is necessary to achieve, and what is acceptable from the point of view of accessibility and what search engines will accept. The result comprises a best practice specification. Search engines don't like some scripting methods, and these should be avoided as far as possible. This subject is still controversial, of course - except among those who improve website performance for a living (and the only performance metric in the final analysis is revenue). They know the problems only too well.

A CMS needs to be standards-compliant and search engine friendly - and it needs to be provably so, regardless of what the salesmen or developers say. There are large numbers of people working with applications who are one or more Internet change cycles behind the times, and as yet, just don't get it.


SEO and CMS

This subject follows on neatly from the previous section, web standards. The purpose of SEO is to improve website revenues, visibility, brand strength and business image. It is effected by improving traffic, website presentation, operation, accessibility, usability, marketing, and compliance with search engine requirements.

Actual tasks required include adding material, removing material, tuning content, and repairing incorrect architecture, navigation and operation. A proportion of repairs are easy to carry out; some are impossible since they involve core operational functions. A percentage of work - sometimes a large percentage - involves repairing faulty development.

The easiest sites to work on and improve are those that use a modern layout scheme and a modern accessible codebase; the hardest are those with outmoded layouts, heavyweight scripting, and a lack of facilities to ensure all page data - including metadata - is unique.

The ideal CMS, in SEO terms, would therefore use a modern page layout scheme and modern lightweight coding; have an absolute minimum of JavaScript; and provide facilities that allow any piece of code in the page source to be made unique for each page. This applies in particular to metadata.
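The metadata requirement can be checked from outside on a handful of generated pages. The sketch below, in Python with the standard library only, collects each page's title and meta description and reports any value shared by two or more pages. The names `HeadMetaParser` and `find_duplicate_metadata` are hypothetical, and fetching the pages is left out: the function assumes you already have a mapping of URL to HTML source.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicate_metadata(pages):
    """pages maps URL -> HTML source. Return values shared by 2+ pages."""
    seen = defaultdict(list)
    for url, html in pages.items():
        parser = HeadMetaParser()
        parser.feed(html)
        seen[("title", parser.title.strip())].append(url)
        seen[("description", parser.description)].append(url)
    return {key: urls for key, urls in seen.items() if len(urls) > 1}
```

An empty result on a reasonable sample of pages suggests the CMS can produce unique metadata; pages that all lack a description will show up as sharing an empty one, which is also worth knowing.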

These requirements are based on the premise that a commercially-viable website needs good search engine traffic; and the best way of achieving that is conformance with the stated points. If no reliance on search traffic is needed, then web standards can be ignored and the CMS can produce any kind of code you like.

In the world of Internet commerce, many take the view that an easier option is simply to generate traffic from online advertising, rather than try to create organic traffic (traffic from search engines). This is a popular choice, and far easier to accomplish than creating organic traffic. One takes cash, one takes skill and an accessible, standards-compliant website; and it's a simple fact that it is often easier to pay for traffic than create it, since that might involve drastic website changes. Or, perhaps more difficult, a change of attitude.

The online world changes every 18 to 24 months in some respects. Because these changes occur gradually, many do not notice them; and a percentage of people are always one or more cycles behind the times.


The first checks to make

One of the first tasks when evaluating a potential CMS, then, is to validate the code of one of its portfolio sites, using the W3C online validator; and then use a similar tool (such as the WebXact validator) to validate the accessibility and allocate a grade. Fewer than, say, five code errors is highly desirable, and fewer than ten is acceptable; over ten code errors means someone doesn't know what they're doing. Not a good message to be sending. Anything over about 30 errors means the coders would be better employed sweeping the street; they certainly can't write code or develop a web application.

Accessibility validation is a tricky subject - at least, it's tricky for web CMS apps to comply. Not impossible, though, as you will find if you validate Plone or Colony. Use one of the many online validators and see how they score. Accessibility is the unknown quantity in web apps right now: no one seems to have a clue what you're talking about. The search engines know, however, and although they don't specifically state that results include an accessibility factor (and may even deny it when pressed), in practice a very large number of accessibility pass / fail points align precisely with best practice for search.

A rough outline of this is that there are three grades: A, AA, and AAA (the W3C's WCAG conformance levels, corresponding to Priority 1, 2 and 3 checkpoints). Every web site should pass the single-A level; if it doesn't, there are some things to be fixed. AA level or double-A is a tough test, as there are hundreds of pass / fail points to get right. AAA or triple-A is the hardest, and cynics might say that only a Notepad page with three lines of text on it will pass that. Maybe so, but it looks like the web is headed that way, for some classes of sites - public service websites for instance.

Validate your proposed web CMS application for accessibility, then find an expert who can tell you precisely what the result means. To anyone except an expert, the many pages of the result will look like techno-gibberish. If it fails single-A, then your best plan might be to step back and think again. If your proposed CMS can validate to double-A on a large page with plenty of mixed content types, then you are looking at a high quality application with a page put together by people who knew what they were doing. And that isn't an everyday sight.


Bad CMS faults

Not every CMS is perfect for search results. In fact, many aren't anywhere near good enough. There are many errors to watch out for, though some are hard to spot unless you are an expert.

Such errors divide into those that can be seen from outside, without access to the server and application; and those that can only be spotted from the inside. The worst, from a search results point of view, can be seen from 'outside' - that is to say, by closely examining a portfolio site and the generated pages.

There are so many of these possible faults it is hard to even choose the worst; here is a far from complete list:

  • Boilerplate metadata. Inability to create unique metadata for every page; each page needs its own, which must not be duplicated (in any respect) anywhere else.

  • Generic metadata format. Many CMS prefix the title or description metadata with the main site name, or use some other similar method - and this is unacceptable. It is better to omit it than do this; though luckily a plugin can often remove it.

  • Multiple duplicate URLs. All dynamic applications create a multitude of page addresses for any single item, depending on the route by which you reach it. There needs to be a facility to remove all these duplicate URLs and show one, and only one, regardless of how the page is reached. This problem is often compounded by URL rewriting; but the best friendly URL solutions take this into account and remove all the duplicates.

  • Page code that doesn't validate. The first task of every developer everywhere should be to validate the page code, after reworking it, creating a plugin, or anything else. This should be Lesson 1 on Day 1 of web coding school. Unfortunately it doesn't seem to even be on the syllabus at most establishments, judging by the results. Even good core CMS code gets wrecked by bad templates and plugins.

  • Heavily scripted page code. Today, we just cannot use the vast amount of heavyweight JavaScript that plagued the web pages of a few years back. Search engines hate it, as it is a primary tool used to attack them. Unfortunately, this is yet another subject many developers are completely unaware of. This item and the previous one are also closely connected with accessibility validation.

  • On-page scripting. Yet more faulty development - scripts should not be written into the page itself; they should be in external files called as necessary. There are a multitude of reasons for this. Again, it is down to developer ignorance.

  • Session IDs. One of the worst sins you can find - absolutely unacceptable. This is where the application adds a session ID to the URL of each page, according to who the visitor is. This is wrong for so many reasons we cannot even begin to list them. It's OK within the checkout process of an ecommerce backend - but only there and nowhere else.

  • Bad source-ordering. This refers to the fact that when you view the page source, the most valuable content must be at the top. Normally, that is the page text. If there are menus above that, it is wrong. If there is a bunch of JavaScript above that, the 'developer' should be shot (or sent to a secure establishment for re-education).
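Two of the faults listed above, duplicate URLs and session IDs, can be audited from outside by reducing each URL found on a site to one canonical form and grouping the results. The following Python sketch (standard library only) shows the idea. The function names and the `SESSION_PARAMS` list are assumptions for illustration and would need adapting to the CMS in question; sorting the query keys also assumes the application does not care about parameter order.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter names commonly used for session tracking. This list is an
# assumption for illustration, not a complete inventory.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}


def canonical_url(url):
    """Return the URL without session parameters, with query keys sorted."""
    parts = urlsplit(url)
    # Java servlet containers can append ";jsessionid=..." to the path.
    path = parts.path.split(";jsessionid=")[0]
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    query.sort()
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def group_duplicates(urls):
    """Map each canonical URL to the crawled URLs that collapse onto it."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return {canon: found for canon, found in groups.items() if len(found) > 1}
```

Any non-empty result from `group_duplicates` on a list of crawled links means the same item is reachable under more than one address, which is the situation a good friendly-URL facility is supposed to prevent.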


However, we should make the point that in the world of dynamic webapps, the majority of CMS now come out rather well when compared to other types of software. For instance, many ecommerce applications are worse. Forums are especially bad here, and some are almost prehistoric in terms of their SEO and usability ratings.


Why search engines like CMS

In many cases we don't need to worry about SEs disliking our software - quite the opposite. It is often found that replacing a hand-coded site with a CMS results in search positions rising markedly. Why might that be?

  • Vast numbers of hand-coded website pages are of low quality in one or more respects - error-filled code, scripted menus, zero SEO score, and so on. But most CM systems now have at least minimum SEO / accessibility levels, so that they will often have superior pagecode.
  • Some things are done better in a WCMS than are likely to be done in a flat site. These include the header metadata and navigation logic.
  • Rather than being incorrect and unwanted, the repetition of certain page elements is sometimes a useful and positive feature. Initially you might think this is always wrong - but you would be mistaken.
  • It used to be a fact that flat HTML pages performed better in search. This is still true to a certain extent - mainly because of the implications rather than any weighting toward that file extension, which would of course be unlikely if not ridiculous. Now, though, there is a new contender: the XHTML pages typically output by CMS perform very, very strongly in search. Organic results are frequently better than expected.

The last point is an interesting one and will no doubt become the subject of some debate. Remember, though - like many interesting ideas - you saw it here first. Naturally, it results from all the implications rather than some strange 'preference' by search engines.


Servers for CMS

The choice you must make is between a LAMP server and IIS; in other words, between Linux - Apache and Microsoft Windows Server 2003 IIS. Perhaps that decision has already been made for you, because you are obtaining a CMS to run on your current server.

LAMP rules of course, though some CMS are platform independent. If in doubt go for Linux - Apache - PHP, since you can't go wrong. If you must use a Windows box, then your choices are much more limited. There are few reasons to choose a Windows server, and the only realistic ones are that your CMS is ASP-based, which demands a Windows server; or that your online business providers use Windows - IIS and you devolve all decisions in this area to them.

If your proposed webapp is ASP / CFM / .NET / MS-SQL based, you will of course need a Windows server. Most users prefer a PHP / Flash / SQL solution, which is LAMP-based, but perhaps in the end it is a matter of personal preference. At any rate, there are a very large number of advantages to the LAMP solution.

Another point here is that all IIS servers can have PHP and MySQL installed. Google 'windows hosting php mysql' and you'll see this clearly. Only resellers who don't have access to the server would be in the position of not being able to offer PHP / MySQL on a Windows server (which, if you see our SEO Hosting section for an explanation, is probably a good reason to avoid them).

A Windows server also needs PHP and MySQL installed on it in order to use all the helper applications, subsidiary applications, and associated applications that any website needs. In this group are website statistics, blogs, wikis and a dozen other similar webapps that virtually all need PHP and MySQL. Trying to exist without these cornerstones of Internet functionality is like driving a car with a steam engine: sure, it may work, but it isn't 21st century.

As far as Sun Computer servers go, they are (in our opinion) a fine solution if you can afford it, but not for shared server use. They may be perfect if you have a dedicated server, high traffic, a very profitable website, and can afford the elevated support costs. As regards multi-user shared boxes, the implementations we have seen (Solaris / Unix for example) were not optimal. Of course, that may just have been the individual software installs or hosts' errors on those particular servers. It is not acceptable, for instance, to be able to see every other user's webspace and usernames on the server via FTP, in a multi-user environment.


Alternative server applications

Apache is not the only option here, though as regards LAN use it may be the easiest, since it generally installs quickly and easily (via the XAMPP route), and this also installs the required ancillary applications such as Perl.

Here are some alternative server apps which certainly bear investigation. Just keep in mind that you will also need PHP, MySQL and Perl on the server. An FTP and email facility also helps with full testing.

lighttpd -
(pronounced 'lighty'). As used on YouTube, Wikipedia. A fast and capable server app. It is reported to handle heavy loads better than Apache, which may have some memory issues in these circumstances. Fits most *NIX operating systems.
http://www.lighttpd.net

Caudium -
Runs on most *NIX OSs. A non-forking monolithic web server. Can be utilised to create non-JavaScript dependent websites since it has a facility to bypass the need for JS scripted functionality in some circumstances. We like that! Written in Pike and C; originally a fork of Roxen Challenger 1.3.
http://caudium.net/index.html

thttpd -
lightweight, fast, compiles on most *NIX OSs.
http://www.acme.com/software/thttpd/

Roxen -
Windows and *NIX OSs. Web-based admin interface, platform-independent architecture.
http://www.roxen.com/products/webserver/

Zeus -
Zeus Web Server (ZWS), ZXTM load balancer. "A great webserver", according to a competitor. Commercial, not open-source.
http://www.zeus.co.uk

Servers for testing a CMS on a LAN

Most CMS in this market area are platform-independent - they run on Linux or Windows. This means you can install one on a PC on your LAN, for testing and trialling. On a Windows PC the required PHP and MySQL are not present, but this is fixed by installing a full-package server application such as XAMPP.
 
A server can therefore be created on any PC; and the PC can run Linux or Windows. A lightweight server app will not be suitable for a Windows PC unless it also includes the (normally required) PHP and MySQL. A Linux PC will need these as well, and depending on whether or not they are present in your chosen Linux distro, they may need to be installed.
 
You can set up a LAN in any number of ways, either cabled or WiFi, though it is normally a little easier, for this purpose, on a cabled LAN.

 

Server security for CMS

There are a lot of issues here. In general, quality hosts do a good job; poor quality hosts are likely to be poor at this as well. Note that 'poor' here equals the observed quality of the service - which has nothing to do with the cost. Some of the worst hosts are the most expensive.
 
The first factor to consider before even signing the contract is PHP default security, since it is often overridden. If this is found, it may be best to go elsewhere, as it indicates the tech support staff (or more likely management) know little about server security, and don't keep in touch with upgrade and security issues.
 
For example, before signing up you should confirm that the default PHP security settings that follow have not been changed (i.e. overridden and turned off) on your server:
 
register_globals=off
magic_quotes_gpc=on
 
Register globals must be off, and magic quotes must be on.
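If the host will supply a copy of the server's php.ini, the two settings can be checked mechanically. The sketch below is in Python with the standard library only; the function names are hypothetical, and it reads only the text of the file, so it cannot see overrides made elsewhere (in the Apache configuration, for instance). The output of PHP's own phpinfo() remains the more reliable source for the values actually in effect. Note also that both directives belong to PHP 4 and early PHP 5; they were removed in PHP 5.4, so the check only makes sense on servers of that era.

```python
# Required values as stated in the text above.
REQUIRED = {"register_globals": False, "magic_quotes_gpc": True}

TRUE_VALUES = {"1", "on", "true", "yes"}


def parse_ini_flags(text):
    """Read 'name = value' lines from php.ini-style text into booleans."""
    flags = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        flags[name.strip().lower()] = value.strip().strip('"').lower() in TRUE_VALUES
    return flags


def check_php_security(text):
    """Return a list of problems; an empty list means nothing was overridden."""
    flags = parse_ini_flags(text)
    problems = []
    for name, wanted in REQUIRED.items():
        # A missing line means PHP's built-in default applies, which for
        # PHP 4.2 to 5.3 matches the required value.
        if flags.get(name, wanted) != wanted:
            state = "on" if wanted else "off"
            problems.append(f"{name} should be {state}")
    return problems
```

For example, passing in text containing `register_globals = On` yields a one-item list naming that setting, which is the cue to raise the matter with the host before signing anything.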

Local htaccess and php.ini resetting for these is irrelevant - if these settings are overridden the server is vulnerable - and has been made so by ignorant hosting staff. There are too many hosting support techs who just don't know this. Where we have found this and insisted it had to be changed, the debate went all the way up to management before it was grudgingly admitted that we were right and the settings needed to be changed (i.e. server security should be switched back on instead of being turned off).

All you have to do is check this with the PHP project itself (php.net) - and take notice of the security announcements from this and similar server application central organisations. If the host is not interested in receiving and actioning vital security, patch and upgrade announcements from Apache, PHP, MySQL and the relevant Linux distro - where does this leave you?

Therefore, checking up on basic PHP security settings is a very useful check here. If they are ignorant on such a basic point, it cannot be a good sign. There are large numbers of vulnerable servers out there, run by hosts who are too busy to deal with things like security patching, upgrades, exploit fixing and so on. We have even seen hosts out there running PHP3 (obsolete since the 1990s) - so be warned.

 
These security settings are sometimes overridden for an old or poorly-written app that cannot work with the settings correctly defined. However, a much better solution than turning server security off is to turn it off locally just for the low-grade application, using a local php.ini or htaccess file. Here is a statement on this from a senior PHP developer:
 
"Turning register globals OFF via a local php.ini or a htaccess file will NOT offer you any extra protection. Another exploited account on your server can simply hack yours. For server security, and since PHP 4.2, register globals is OFF server wide by default (PHP default). Any host overriding this is inviting trouble. If you need register globals ON for a specific site, simply use an htaccess file for that specific directory, and server wide security will not be compromised. Of course, if you do this, be sure all effected scripts fully sanitise input data.
DO NOT OVERRIDE DEFAULT PHP SECURITY."

[i.e. if the correct security setting has been wrongly set and overridden at server level, you are wasting your time trying to reset that locally; it may allow your application to work correctly, but you are still vulnerable since the whole server is able to be exploited].

 

CMS server round-up

Just to make this clearer, in case you are still confused: ideally your server has PHP scripting, a MySQL database or two, and runs Apache on Linux. This is called a LAMP server (Linux, Apache, MySQL, PHP); the main alternative is a Microsoft IIS server. Both run on a computer that is called a server, though it is quite often no different from any other PC. A local server in a small office, aka a LAN server, can be any old PC if its main use is fileserving. An Internet production server, though (that is, a public-facing website server), needs to have a decent specification. Plenty of RAM, fast disks in slide-out caddies and a good mainboard head up the spec. Considerably more important, though, are the people running it.

On a LAMP server, the OS is Linux, but there is no need to worry about that: it is transparent to the user (meaning the user, who in this case is the CMS implementer, doesn't see it). The end-user or visitor can't tell what server, OS, or anything else is being used in any case - unless something goes wrong, in which case if you see an Apache error message or the Windows default one, you'll know a little bit more about the server environment. That's why properly-run servers have entirely custom-made error pages and don't use the defaults - even for an internal server error. You may have to go a long way to find a properly-run multi-user server.
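Whether a host still serves stock error pages can be checked by requesting a page that does not exist and looking at what comes back. The Python sketch below only inspects the returned text; fetching it is left out. The two signatures are assumptions based on what typical default Apache and IIS error pages contain, and real servers vary by version and configuration, so treat a match as a hint rather than proof.

```python
import re

# Assumed signatures of stock error pages; illustrative only.
DEFAULT_SIGNATURES = [
    ("Apache", re.compile(r"<address>\s*Apache", re.IGNORECASE)),
    ("IIS", re.compile(r"Internet Information Services", re.IGNORECASE)),
]


def default_error_page_server(body):
    """Return the server name a stock error page gives away, or None."""
    for server, pattern in DEFAULT_SIGNATURES:
        if pattern.search(body):
            return server
    return None
```

A result of None on a deliberately broken URL is consistent with custom error pages being in place.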

Occasionally we get asked the following question: can I run MS IIS with Apache (or vice-versa)? There is obviously some confusion here, because each of those items is an independent server application. You cannot run Microsoft IIS on a Linux - Apache server, because IIS is Microsoft's web server application for Windows - it allows a Windows box to function fully as a server. You can, though, run Apache on a Windows box, as the server app. However, this is only recommended on a dev LAN (a local network for testing), not for production (a live site on the Net). There are too many security issues, which is why XAMPP for instance, a one-click package for doing just this, is only used for dev LANs.

So: a webserver is best run with Linux as the OS and Apache or one of the alternatives such as lighttpd as the server app; and if you need a Windows server because your main webapp (a CMS perhaps) is ASP based, then Windows Server 2003 IIS is the way to go. Don't mix them except on a closed LAN. You can also safely use PHP and MySQL on a Windows server - provided your tech support people know how to install them, which some don't.



 
© 2008 A3webtech