operations
Web site crawler and link checker (free)
Jan 13th
In a previous post I provided a utility called LinkChecker that is a web site crawler and link checker. The idea behind LinkChecker is that you can include it in your continuous integration scripts and thus check your web site either regularly or after every deployment. Unlike a simple ping check, this one will fail if you’ve broken any links within your site or have SEO issues. It will also break just once for every site change and then be fixed the next time you run it. This feature means that in a continuous integration system like TeamCity you can get an email or other alert each time your site (or perhaps your competitor’s site) changes.
As promised in that post, a new version is now available. There are many improvements under the covers, but one obvious new feature is the ability to dump all the text content of a site into a text file. Simply append -dump filename.txt to the command line and you’ll get a complete text dump of any site. The dump includes page titles and all visible text on the page (it excludes embedded script and CSS automatically). It also excludes any element with an ID or CLASS that includes one of the words “footer”, “header”, “sidebar” or “feedback”, so you don’t get lots of duplicate header and footer information in the dump. I plan to make this more extensible in future to allow other words to be added to the ignore list.
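To make the filtering rule concrete, here is a minimal Python sketch of that kind of text dump. It is not LinkChecker's actual code: the class name TextDumper, the function dump_text and the case-insensitive substring match on ID and CLASS are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Words taken from the post; case-insensitive substring matching is an assumption.
IGNORED_WORDS = ("footer", "header", "sidebar", "feedback")
SKIPPED_TAGS = {"script", "style"}
# Void elements never have a closing tag, so they must not open a skipped region.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextDumper(HTMLParser):
    """Hypothetical helper: collects visible text, skipping ignored elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip_tag = None   # tag name of the ignored element we are inside
        self.skip_depth = 0    # nesting count of that same tag name

    def _is_ignored(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            return True
        for name, value in attrs:
            if name in ("id", "class") and value:
                lowered = value.lower()
                if any(word in lowered for word in IGNORED_WORDS):
                    return True
        return False

    def handle_starttag(self, tag, attrs):
        if self.skip_tag is not None:
            # Only count nested tags of the same name, so unclosed <p> or <li>
            # elements inside the ignored region cannot confuse the counter.
            if tag == self.skip_tag:
                self.skip_depth += 1
        elif tag not in VOID_TAGS and self._is_ignored(tag, attrs):
            self.skip_tag = tag
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if self.skip_tag is not None and tag == self.skip_tag:
            self.skip_depth -= 1
            if self.skip_depth == 0:
                self.skip_tag = None

    def handle_data(self, data):
        text = data.strip()
        if self.skip_tag is None and text:
            self.chunks.append(text)


def dump_text(html):
    """Return the page title and visible text, one chunk per line."""
    parser = TextDumper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling dump_text on a page's HTML returns the title and body text with the navigation chrome stripped out. A real crawler would also need to fetch pages and follow links, which this sketch leaves out.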
One technique you can use with this new ‘dump’ option is to dump a copy of your site after each deployment and then check it into source control. Now, if there’s ever any need to go back to see when a particular word or paragraph was changed on your site, you have a complete record. You could, for example, use this to maintain a text copy of your WordPress blog, or perhaps to keep an eye on someone else’s blog or Facebook page to see when they added or removed a particular story.
Download the new version here:- LinkCheck (requires Windows XP or later with .NET 4 installed; unzip and run)
Please consult the original article for more information.
LinkCheck is free. It doesn’t make any callbacks and doesn’t use any personal data. Use at your own risk. If you like it, please make a link to this blog from your own blog or post a link to Twitter, thanks!
Why don’t you trust your build system?
Mar 31st
In this post I’m going to challenge the conventional wisdom that the best place to store configuration information is in XML config files or database entries rather than in code files. A typical comment I see goes something like this: “I can’t change routes without recompiling and deploying code. I thought it better to configure routes in a more dynamic environment, specifically the database …”
Some observations:-
1) Why is ‘recompiling and deploying’ an issue? If you have continuous integration and continuous deployment to a test server it should be a non-issue.
2) Databases (alone) lack any tracking metadata – who changed what and when? Who do you blame when your site stops working because someone edited the configuration?
3) Databases and XML files do not provide the strong typing and IntelliSense that you get in the IDE when you access configuration settings defined in code.
4) Do you *want* your ops team to be able to edit your configuration settings?
So instead of using XML files that can be changed in production environments by anyone without any tracking, why not build your configuration settings into code files, where they can be strongly typed and are always under version control, giving you a full history of who changed what and when? If, like me, you deploy several times a day because you trust your build and deployment system, this is actually the easiest and safest way to make configuration changes.
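As a rough illustration of configuration living in a code file, here is a small Python sketch. The names Route, SiteConfig, CONFIG and handler_for, and all the values, are hypothetical; the same idea applies in any language with types and editor completion.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    pattern: str
    handler: str


@dataclass(frozen=True)
class SiteConfig:
    site_name: str
    max_upload_mb: int
    routes: Tuple[Route, ...]


# The whole configuration is ordinary code: it is checked by the editor,
# reviewed like any other change, and its history lives in version control.
# All values below are made up for the example.
CONFIG = SiteConfig(
    site_name="example",
    max_upload_mb=25,
    routes=(
        Route("/", "home"),
        Route("/posts/{id}", "show_post"),
    ),
)


def handler_for(pattern):
    """Look up the handler name registered for an exact route pattern."""
    for route in CONFIG.routes:
        if route.pattern == pattern:
            return route.handler
    raise KeyError(pattern)
```

Because the dataclasses are frozen, nothing can reassign a setting at runtime. Changing a route means editing this file, committing it, and letting the build and deployment process carry it to production.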
Looking forward to the new year and our new data center
Dec 31st
In the new year I’m going to be moving all our servers over to a data center in Issaquah, WA. I’m looking forward to having some faster hardware and a better connection able to provide a better experience to all the customers of our digital signage solution.
Happy New Year!
Continuous Integration -> Continuous Deployment
Dec 29th
There was some interesting discussion today around the topic of continuous deployment (pushing incremental builds to production rapidly rather than batching them up). Here are some thoughts on that topic:
Personally I think you should be able to make a fix and deploy it to production in a matter of minutes but that you should do this rarely and it depends on the type of fix.
During the early stage of a lean startup you can go without a staging environment: the minimum is a continuous integration server deploying to a local development environment and then the ability to manually push to production as needed. (Note: I consider continuous integration a bare minimum.)
How do you define quality? The formula I have for perceived quality is:-
Sum (Severity of bug x number of times experienced by users)
Clearly data-loss bugs rank high on severity and should be caught before you get to production, but for many other classes of bug, IF you can fix them quickly, the overall user impact can be minimized more effectively by having a faster build-deploy cycle than by having a perfect test suite. And since the perfect test suite is in any case impossible, you may as well invest in improving your build-deploy cycle first. In an ideal world no user ever experiences the same bug twice and no two users experience the same bug.
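A tiny Python sketch of the formula shows why fix speed matters. The function name bug_impact, the severity scale and the counts are made-up numbers for illustration only; a lower total means better perceived quality.

```python
def bug_impact(bugs):
    """Sum of (severity x times experienced) over (severity, times) pairs."""
    return sum(severity * times for severity, times in bugs)


# Hypothetical case: the same severity-3 bug under two release processes.
# Slow cycle: 200 users hit the bug before the fix ships.
slow_cycle = bug_impact([(3, 200)])
# Fast cycle: the fix is deployed after only 5 users have hit it.
fast_cycle = bug_impact([(3, 5)])

print(slow_cycle, fast_cycle)  # prints "600 15"
```

The severity of the bug is identical in both cases; only the number of times it was experienced changes, and that is the term a faster build-deploy cycle reduces.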
This means, however, that you need a rock-solid build and deployment system that you trust as much as you trust the compiler. (And, by the way, if you have this, the need to pull configuration settings into XML files that you can edit on production servers goes away: if you trust your build and deployment system, you can make the change in code and rely on the process to make the change in production. As an added benefit, you now also have an audit trail of who changed what and when, and you can lock down your production environment so that very few people have direct access beyond deploying new bits to it using the prescribed process. As we all know, most problems in production are caused by human error, and if you allow people to make random changes there, it’s hard to figure out the who, what, when and where of any critical error.)
Another corollary of this approach is that you need in-depth logging and exception reporting. You need to be able to understand what caused a bug to appear and how to reproduce it from a single instance of it happening. Your logging around an exception should include the entire state (which file, which user, cookies, referrer, steps leading up to it, …). You should record exceptions in a database so you can see which ones are most common and can correlate their occurrence with any changes you made to the site. Your error reporting also needs to encompass the JavaScript running on your customers’ computers, with every JavaScript exception reported back to your site using a web service. After all, my formula is ‘bugs experienced’ NOT ‘bugs experienced that we happen to know about’!
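Recording exceptions in a database can be sketched in a few lines of Python using the standard sqlite3 module. The table layout, the function names and the context fields are assumptions for illustration; a production site would more likely write to its main database server.

```python
import json
import sqlite3
import traceback
from datetime import datetime, timezone


def open_error_log(path=":memory:"):
    """Open (or create) a hypothetical error log database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS errors (
               id INTEGER PRIMARY KEY,
               occurred_at TEXT NOT NULL,
               exc_type TEXT NOT NULL,
               message TEXT NOT NULL,
               stack TEXT NOT NULL,
               context TEXT NOT NULL)"""
    )
    return conn


def record_exception(conn, exc, context):
    """Store one exception with its stack trace and request state as JSON."""
    stack = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    conn.execute(
        "INSERT INTO errors (occurred_at, exc_type, message, stack, context) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            type(exc).__name__,
            str(exc),
            stack,
            json.dumps(context, sort_keys=True),
        ),
    )
    conn.commit()


def most_common_errors(conn, limit=10):
    """Return (exc_type, message, count) rows, most frequent first."""
    return conn.execute(
        "SELECT exc_type, message, COUNT(*) AS n FROM errors "
        "GROUP BY exc_type, message ORDER BY n DESC, exc_type LIMIT ?",
        (limit,),
    ).fetchall()
```

In a request handler you would wrap the work in try/except and call record_exception with whatever state you have (user, URL, cookies, referrer). The occurred_at column is what lets you line exceptions up against deployment times.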
Another trick a lean startup can use for deployment is to employ Subversion as a binary version control system. That is, your continuous integration server does a build and then checks the binaries into a different Subversion tree. To deploy (for a fix with no database changes) you simply do an SVN update on the production server. It’s fast, efficient and most importantly atomic (unlike XCOPY). It also provides an immediate roll-back capability: simply go back a revision. Another advantage is that you can apply fixes to just one file (e.g. an image or html file) by updating just that file and can be sure that no other files changed in the process. And, again, it gives you a complete audit trail so you can see how any file has changed over time and relate that to any changes in exceptions being logged.
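The deploy and roll-back steps might be scripted along these lines. This is a sketch only: the function names and the working-copy path are hypothetical, and it assumes the svn command-line client is installed and the production directory is already a checked-out working copy of the binaries tree.

```python
import subprocess


def svn_update_command(working_copy, revision=None):
    """Build the svn command line; no revision means update to the latest."""
    cmd = ["svn", "update", "--non-interactive"]
    if revision is not None:
        cmd += ["-r", str(revision)]
    cmd.append(working_copy)
    return cmd


def deploy(working_copy, revision=None, run=subprocess.run):
    """Update the production working copy; raises if svn exits non-zero.

    The run parameter exists only so the call can be swapped out in tests.
    """
    return run(svn_update_command(working_copy, revision), check=True)


# Example usage (hypothetical path and revision numbers):
#   deploy("/srv/site")                # deploy the latest build
#   deploy("/srv/site", revision=41)   # roll back to revision 41
```

Passing a single file path instead of the working-copy root gives the one-file fix described above.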
So, in summary: major UI changes and database changes should be pushed to production infrequently and in a very controlled fashion, but minor UI changes, critical bug fixes, … can happen all day long, all the time, if you have the right process in place.
When will people learn to back up?
Dec 11th
Had another friend lose a hard drive today without a proper backup. Pain!
I now have at least 3 copies of everything with staggered backups to different hardware. For the digital signage software I manage there are two copies on the servers locally and another copy in the cloud on Amazon’s S3 storage which is itself replicated multiple times.
The basic concept that people need to use is “SHARED NOTHING”.
RAID is NOT the answer to data security. It’s a convenient recovery mechanism for failed hard drives, but if your data is on two drives connected to the same RAID controller card on the same computer in the same room, you have plenty of opportunities to lose it all.
Backups should ROTATE. Backing up to the same location risks a failure in the backup that could wipe both copies, or copy bad data over good before the mistake is discovered. For really critical data I have a daily backup, a weekly backup, and a monthly backup. I also have two backup schedules: a morning backup that copies to one set of drives on controller A and an evening backup that copies to a different set of drives on a different controller.
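A rotation schedule like this can be expressed as a small Python function. The specific rules here are assumptions made for the sketch, not a description of the actual setup: weekly copies are refreshed on Sundays, monthly copies on the 1st, and noon is the split between the morning and evening targets.

```python
from datetime import datetime


def backup_plan(when):
    """Return (tiers to refresh, target drive set) for a backup run.

    Assumed rules: the daily copy is refreshed on every run, the weekly copy
    on Sundays, the monthly copy on the 1st. Runs before noon go to the
    drives on controller A, later runs to the drives on controller B.
    """
    tiers = ["daily"]
    if when.weekday() == 6:  # Monday is 0, Sunday is 6
        tiers.append("weekly")
    if when.day == 1:
        tiers.append("monthly")
    target = "controller-a" if when.hour < 12 else "controller-b"
    return tiers, target


# Example: an evening run on Sunday 7 January 2024.
print(backup_plan(datetime(2024, 1, 7, 20, 0)))
# prints (['daily', 'weekly'], 'controller-b')
```

Because the weekly and monthly copies are overwritten far less often than the daily one, bad data copied over the daily backup still leaves older good copies to restore from.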