<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>Technology Musings &#187; Software Design</title>
	<atom:link href="http://www.technologymusings.com/category/softwaredesign/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.technologymusings.com</link>
	<description>Thoughts about Technology and Startup&#039;s</description>
	<lastBuildDate>Thu, 19 May 2011 18:57:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>Real Life Issues With Big Data In The Enterprise &#8211; The Issues With Data Completeness</title>
		<link>http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/</link>
		<comments>http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 11:58:54 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[CIO]]></category>
		<category><![CDATA[Enterprise Data Modeling (EDM)]]></category>
		<category><![CDATA[Data Completeness]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=303</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/">Real Life Issues With Big Data In The Enterprise &#8211; The Issues With Data Completeness</a> </p><p>So completeness can mean a lot of different things.&#160; In this case I am going to define a piece of data as being complete if the description of the data contains all of the available information about the item in question and if that description is captured in a manner which represents a true representation ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/">Real Life Issues With Big Data In The Enterprise &#8211; The Issues With Data Completeness</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/">Real Life Issues With Big Data In The Enterprise &#8211; The Issues With Data Completeness</a> </p><p>Completeness can mean a lot of different things.&#160; In this case I am going to define a piece of data as complete if its description contains all of the available information about the item in question and if that description is captured in a context-neutral manner that truly represents the object.&#160; In my experience, a lack of completeness in this sense is the single biggest cause of data issues in the enterprise.&#160; In fact, if this were done well, the issues listed in the first article of this series would be much less likely to occur.</p>
<p>To read the complete article, you can find it on the <a href="http://www.nebility.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/">Nebility Blog</a>.</p>
<p>As always, you can reach me through the comments, at <a href="http://www.nebility.com">Nebility</a>, on <a href="http://www.twitter.com/techmusings">Twitter</a>, <a href="http://www.linkedin.com/in/paulkmichaud">LinkedIn</a> or by using the <a href="http://www.technologymusings.com/expert-technology-advice">Ask the Experts</a> form.</p>
<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/">Real Life Issues With Big Data In The Enterprise &#8211; The Issues With Data Completeness</a> </p>]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-completeness/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[Challenges When Dealing With Big Data]]></series:name>
	</item>
		<item>
		<title>Real Life Issues With Big Data In The Enterprise &#8211; The Issues With Data Consistency (Or Lack Thereof)</title>
		<link>http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/</link>
		<comments>http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/#comments</comments>
		<pubDate>Tue, 08 Mar 2011 03:14:22 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[CIO]]></category>
		<category><![CDATA[Executive Discussions]]></category>
		<category><![CDATA[Data Consistency]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=299</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/">Real Life Issues With Big Data In The Enterprise &ndash; The Issues With Data Consistency (Or Lack Thereof)</a> </p><p>Large Enterprises face huge challenges when dealing with their Big Data.  In this article I am going to outline some of the common challenges with Big Data I see firms dealing with on a day to day basis.  This is a continuation of the discussion that was started in the article titled “The Challenges of ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/">Real Life Issues With Big Data In The Enterprise &ndash; The Issues With Data Consistency (Or Lack Thereof)</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/">Real Life Issues With Big Data In The Enterprise &ndash; The Issues With Data Consistency (Or Lack Thereof)</a> </p><p>Large Enterprises face huge challenges when dealing with their Big Data.  In this article I am going to outline some of the common challenges with Big Data I see firms dealing with on a day to day basis.  This is a continuation of the discussion that was started in the article titled “<a href="http://www.nebility.com/the-challenges-of-dealing-with-big-data/">The Challenges of Dealing With Big Data</a>”.</p>
<p>In the previous article we discussed how many firms, and much of the discussion in and out of the press, are focused on how to analyze and gain insight from Big Data (whether it be on Twitter or in the traditional Enterprise).  Furthermore, I outlined how, in my personal experience, the root of the true problems with Big Data often lies not in how we analyze the data or what tools we use, but in how we capture it, or fail to capture it, in the first place.  In essence, our failure to capture the data accurately and consistently often renders analysis of it a meaningless exercise due to the Garbage In = Garbage Out (GIGO) principle.  To make this issue clearer, I am going to provide some real world examples of the Big Data issues I come across with my clients on a regular basis.  Unfortunately, as I started writing this it was getting more than a bit long, so I have broken it into three shorter posts, of which this is the first.</p>
<p>The full article can be found on the <a href="http://www.nebility.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/">Nebility Blog</a>.</p>
<p>As always, you can reach me through the comments, at <a href="http://www.nebility.com">Nebility</a>, on <a href="http://www.twitter.com/techmusings">Twitter</a>, <a href="http://www.linkedin.com/in/paulkmichaud">LinkedIn</a> or by using the <a href="http://www.technologymusings.com/expert-technology-advice">Ask the Experts</a> form.</p>
<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/">Real Life Issues With Big Data In The Enterprise &ndash; The Issues With Data Consistency (Or Lack Thereof)</a> </p>]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/real-life-issues-with-big-data-in-the-enterprise-the-issues-with-data-consistency-or-lack-thereof/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[Challenges When Dealing With Big Data]]></series:name>
	</item>
		<item>
		<title>The Challenges of Dealing With Big Data</title>
		<link>http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/</link>
		<comments>http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/#comments</comments>
		<pubDate>Fri, 04 Mar 2011 16:15:32 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Enterprise Data Modeling (EDM)]]></category>
		<category><![CDATA[Executive Discussions]]></category>
		<category><![CDATA[Nebility]]></category>
		<category><![CDATA[Strategy]]></category>
		<category><![CDATA[Enterprise Data Architecture]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=293</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/">The Challenges of Dealing With Big Data</a> </p><p>Big Data poses many challenges to those firms who have to deal with it on a regular basis.  On the flip side,  those who do it well and can use it to feed their analysis and gain insights into their business, customers and market trends will reap huge rewards for their efforts. I have just ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/">The Challenges of Dealing With Big Data</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/">The Challenges of Dealing With Big Data</a> </p><p>Big Data poses many challenges to those firms who have to deal with it on a regular basis.  On the flip side,  those who do it well and can use it to feed their analysis and gain insights into their business, customers and market trends will reap huge rewards for their efforts.</p>
<p>I have just posted an article which will most likely be the start of a series on this topic over on the Nebility Blog.  The article was inspired by <a href="http://www.semilshah.com">Semil Shah</a> and a Twitter request (<a href="http://twitter.com/semilshah">@semilshah</a>) he made looking for people with Big Data expertise, and by a subsequent post of his that he pointed me to, discussing <a href="http://semilshah.posterous.com/three-levels-of-value-created-by-big-data" class="broken_link">ways to create value from big data</a>.  I would encourage you to read Semil’s thesis and the article over on <a href="http://www.nebility.com">Nebility</a>.</p>
<p>The complete article can be found <a href="http://www.nebility.com/the-challenges-of-dealing-with-big-data/">here</a>.  Enjoy!</p>
<p>As always, you can reach me through the comments, at <a href="http://www.nebility.com">Nebility</a>, on <a href="http://www.twitter.com/techmusings">Twitter</a>, <a href="http://www.linkedin.com/in/paulkmichaud">LinkedIn</a> or by using the <a href="http://www.technologymusings.com/expert-technology-advice">Ask the Experts</a> form.</p>
<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/">The Challenges of Dealing With Big Data</a> </p>]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/the-challenges-of-dealing-with-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[Challenges When Dealing With Big Data]]></series:name>
	</item>
		<item>
		<title>Introducing Ask The Experts</title>
		<link>http://www.technologymusings.com/introducing-ask-the-experts/</link>
		<comments>http://www.technologymusings.com/introducing-ask-the-experts/#comments</comments>
		<pubDate>Tue, 08 Feb 2011 23:57:19 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Ask The Experts]]></category>
		<category><![CDATA[Software Design]]></category>
		<category><![CDATA[Technology Startups]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Expert Technology Advice]]></category>
		<category><![CDATA[Startups]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=258</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/introducing-ask-the-experts/">Introducing Ask The Experts</a> </p><p>So as we relaunch the Technology Musings blog,  we have added a new feature to it called “Ask The Experts”.  You will find the link to this at the top of the Technology Musings blog. This came about because a lot of people who read this blog have over time tracked me down through Twitter, ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/introducing-ask-the-experts/">Introducing Ask The Experts</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/introducing-ask-the-experts/">Introducing Ask The Experts</a> </p><p>So as we relaunch the Technology Musings blog,  we have added a new feature to it called “<a href="http://www.technologymusings.com/expert-technology-advice">Ask The Experts</a>”.  You will find the link to this at the top of the Technology Musings blog.</p>
<p><span>This came about because a lot of people who read this blog have over time tracked me down through <a href="http://www.twitter.com/techmusings">Twitter</a>, <a href="http://www.linkedin.com/in/paulkmichaud">LinkedIn</a>, and other means in order to ask me specific questions about something of interest to them or about a particular problem they are having.  As a result we decided to add a simple form to the blog which allows people to ask these questions without having to jump through hoops to get to me and my colleagues.  So we would encourage you to use it to ask your questions.  We can’t promise to be able to answer them all, but we will do our best to at least respond to every request.  Things we would expect to get through this mechanism include, but are not limited to, the following:</span></p>
<ul>
<li><span>Requests for a blog article on a certain topic</span></li>
<li><span>Questions about something we have already written about that you feel would be better served by a direct question than by the comments (but we encourage you to use the comments so others can learn as well)</span></li>
<li><span>Specific questions related to some challenge you may be having with a project of your own.  See note below on this one.</span></li>
<li><span>Anything else you folks can come up with</span></li>
</ul>
<p><span>Where appropriate we will try to put the answers to questions into a blog post, either here or on the Nebility Blog, depending on the topic and how it aligns with each blog.  We will of course anonymize anything we use for a blog article and make it more general, unless you give us specific permission to use your name/case as is.</span></p>
<p><span>NOTE: while we will try to answer every request we receive through this “Ask the Experts” mechanism, we cannot guarantee an answer to everything, either because we don’t know the answer, don’t have time, or the question is too specific to someone&#8217;s particular system issue and requires too detailed an answer to respond quickly.  In these cases we will at least try to get back to you to let you know we can’t answer and why. </span></p>
<p>Anyhow, what we are hoping is that people will use this opportunity to ask questions about any topic in the following broad categories:</p>
<ul>
<li>Technology in General</li>
<li>Systems Design and Architecture</li>
<li>Systems Implementation</li>
<li>Enterprise Data</li>
<li>Large Scale Systems</li>
<li>User Experience Techniques and Technologies</li>
<li>Technology Startups</li>
<li>Requests for a blog article on a specific topic</li>
<li>Any other question or request which may fall inside our sphere of expertise</li>
</ul>
<p>So think of the questions you may have that we could possibly answer and start asking away.  We can’t promise to be able to answer every question that comes our way as we may not always have the time to answer every question or the expertise to answer it.  That said, we will try to at least respond to every request as soon as we can.</p>
<p>So go ahead and “<a href="http://www.technologymusings.com/expert-technology-advice">Ask The Experts</a>”.</p>
<p>As always, you can reach me through the comments, at <a href="http://www.nebility.com">Nebility</a>, on <a href="http://www.twitter.com/techmusings">Twitter</a>, <a href="http://www.linkedin.com/in/paulkmichaud">LinkedIn</a> or by using the <a href="http://www.technologymusings.com/expert-technology-advice">Ask the Experts</a> form.</p>
<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/introducing-ask-the-experts/">Introducing Ask The Experts</a> </p>]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/introducing-ask-the-experts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Back From The Dead</title>
		<link>http://www.technologymusings.com/back-from-the-dead/</link>
		<comments>http://www.technologymusings.com/back-from-the-dead/#comments</comments>
		<pubDate>Wed, 02 Feb 2011 16:25:00 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Nebility]]></category>
		<category><![CDATA[Service Oriented Architecture (SOA)]]></category>
		<category><![CDATA[Software as a Service]]></category>
		<category><![CDATA[Solution Design]]></category>
		<category><![CDATA[Technology Startups]]></category>
		<category><![CDATA[Technology Strategy]]></category>
		<category><![CDATA[CIO]]></category>
		<category><![CDATA[Technology Startup]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=225</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/back-from-the-dead/">Back From The Dead</a> </p><p>Well looking back at the last blog post I did, it's been over a year since I wrote it and I thought I would provide some insight into why the long hiatus.</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/back-from-the-dead/">Back From The Dead</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/back-from-the-dead/">Back From The Dead</a> </p><p>Well looking back at the last blog post I did, it&#8217;s been over a year since I wrote it and I thought I would provide some insight into why the long hiatus.</p>
<p>As any of you who read my Bio here on the blog know, I had been working for IBM as Executive IT Architect for Financial Services when I wrote the last post, a role which I really enjoyed. One of the problems with the role, though, as it pertained to this blog, was that I found myself constantly having to abandon articles out of concern that someone at IBM would feel they went against the official company line, particularly since a number of IBM executives subscribed to this blog. Now in IBM&#8217;s defense, they never gave me any grief over this blog and in fact encouraged me and others to blog openly, so it was my own choice to stop for a while in an attempt to remove any potential conflict of interest.</p>
<p>Another factor that made me stop blogging was the fact that I was already considering starting a new company and was being particularly careful about what I did or wrote during my employ, again to make sure I did not violate any potential IP agreements or other implied contracts. This is an area, if any of you read blogs from Venture Capitalists such as <a href="http://www.feld.com/wp/">Brad Feld</a>, <a href="http://www.bothsidesofthetable.com/">Mark Suster</a> or others, where many founders are not sufficiently careful and can get themselves and their new firm into trouble, before it even starts. So when combining these two issues I felt it was best to take a vacation from blogging until such time as any potential conflict of interest was removed.</p>
<p>Now is that time.</p>
<p>As of the end of September 2010, I have left IBM and started a new company with a couple of senior colleagues of mine, who will be introduced here over time.</p>
<p>The company is called <strong><a href="http://www.nebility.com">Nebility</a></strong> and it will be focused on a couple of areas:</p>
<ol>
<li><span>We will be designing both custom and shrink wrapped commercial software which we will offer as both stand alone and as Software as a Service (SaaS) offerings</span></li>
<li><span>We will be doing some consulting to other firms where it is focused on helping them design and build custom Service Oriented Architecture (SOA) based applications</span></li>
<li><span>We are developing an integrated library of enterprise caliber, prebuilt service components for a range of uses which acts as an accelerator for 1 &amp; 2 and which we are calling <strong><a href="http://www.nebility.com/enterprise-saas-solutions/software-as-a-service-sdk/">NebulaBlocks</a></strong>.</span></li>
<li><span>We are also developing some very advanced, proprietary technology which will help us do items 1, 2 and 3 and which we believe provides us with a game changing competitive advantage. So for now I will remain quiet about it <img class="wlEmoticon wlEmoticon-winkingsmile" style="border-style: none;" src="http://www.technologymusings.com/wp-content/uploads/2011/02/wlEmoticon-winkingsmile.png" alt="Winking smile" width="19" height="19" />.</span></li>
</ol>
<p>So, what does all of this mean for this blog? Well it means:</p>
<ol>
<li>I will be back to blogging and this time with fewer restrictions than before.</li>
<li>We will also be starting a blog on the <strong>Nebility</strong> web site at <a href="http://www.Nebility.com">www.Nebility.com</a> and will be dividing articles between the two depending on their focus. More on this to follow.</li>
<li>Some of my colleagues in this new venture will also be blogging on this site and the Nebility blog and will thus provide additional perspective reflecting our individual areas of expertise and specialization</li>
</ol>
<p><span>So you may be wondering how will the two blogs TechnologyMusings.com and the <a href="http://www.nebility.com/enterprise-solutions-done-right/">Nebility.com blog</a> be divided? While we don&#8217;t think there will be a really hard and fast rule our plan is as follows:</span></p>
<ol>
<li>TechnologyMusings.com will cover a range of topics, including discussions of technologies and our experiences as we play with new technologies that we are evaluating and using for Nebility.  Technology Musings will also have material on startup issues and considerations we are deliberating as we go through this startup journey.  It will also include general rants and wide ranging discussions about whatever strikes our fancy on any given day.</li>
<li>The Nebility.com blog will attempt to focus almost exclusively on business and technology discussions,  how-to type articles at both the design and implementation level (much like I have done here on this blog in the past).  On the Nebility blog we will also discuss a lot about business and technology strategy, Service Oriented Architecture (SOA), Software as a Service (SaaS) and issues and challenges in specific application verticals we are working on.  In addition we will also cover general issues and frustrations experienced by ourselves and our customers with enterprise software and other related topics.</li>
</ol>
<p><span>We will post a link here on TechnologyMusings to any articles which are posted on the Nebility blog, where appropriate, but we would encourage you to subscribe to both blogs to be sure to catch all of the articles.</span></p>
<p><span>In addition, you will notice that there is a new layout to the blog and some new features which we hope will allow you to reach us and interact with us better. </span></p>
<p><span>Well that’s about it for now.  I hope you enjoy the new articles on both blogs.</span></p>
<p>As always, you can reach me through the comments, at <a href="http://www.nebility.com">Nebility</a>, on <a href="http://www.twitter.com/techmusings">Twitter</a>, <a href="http://www.linkedin.com/in/paulkmichaud">LinkedIn</a> or by using the <a href="http://www.technologymusings.com/expert-technology-advice">Ask the Experts</a> form.</p>
<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/back-from-the-dead/">Back From The Dead</a> </p>]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/back-from-the-dead/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Consideration For The Technical Implementation of an SOA</title>
		<link>http://www.technologymusings.com/consideration-for-the-technical-implementation-of-an-soa/</link>
		<comments>http://www.technologymusings.com/consideration-for-the-technical-implementation-of-an-soa/#comments</comments>
		<pubDate>Sun, 06 Sep 2009 20:35:42 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Service Oriented Architecture (SOA)]]></category>
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=152</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/consideration-for-the-technical-implementation-of-an-soa/">Consideration For The Technical Implementation of an SOA</a> </p><p>System architects and programmers often don't consider the performance needs of their systems ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/consideration-for-the-technical-implementation-of-an-soa/">Consideration For The Technical Implementation of an SOA</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/consideration-for-the-technical-implementation-of-an-soa/">Consideration For The Technical Implementation of an SOA</a> </p><p>I had a lot of questions from people after my last post on <a title="BPM approach to SOA" href="http://www.technologymusings.com/softwaredesign/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails">BPM and SOA</a> about the layered SOA I proposed and whether it would be slow performance wise.  The answer I gave people was &#8220;It depends&#8221;.  In this post I will outline in more detail some of the considerations needed around performance when implementing an SOA or any system for that matter.</p>
<p>Firstly, I have found that over the last 10-15 years system architects and programmers often don&#8217;t consider the performance needs of their system enough when designing it.  The process most people follow is typically:</p>
<ol>
<li>Code the system up (or at least a prototype)</li>
<li>Run a small performance benchmark</li>
<li>Size the hardware to get the desired performance</li>
</ol>
<p>To be honest this drives me crazy, as it often yields a very non-performant system, and if the performance is not achieved with hardware at step 3 then it&#8217;s too late in most cases to do anything about it without going back and severely refactoring the code or, worse yet, rewriting it altogether.</p>
<p>Years ago this was not the case, because the hardware was too slow for you to assume you could simply find a big enough box to hit your performance goals.  As a result, programmers before about 1993 thought very carefully about how they designed and implemented their code, with performance front of mind before even a single function got coded.  But once the bigger SMP UNIX boxes started coming out, programmers got sloppy, to the point where today most of the younger programmers I come across (those who started commercial programming post say 1997 <img src='http://www.technologymusings.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' />  ) have not been taught how to evaluate the effects of their technology or programming decisions on performance.</p>
<p>In the old days programmers were taught, when implementing any function or procedure, to consider the Order &#8220;O&#8221; of their algorithm.  We worried about whether it was Order N, N Log N, N^2 or God forbid N^3 or worse.  If we could replace an O(N^2) algorithm with one that was O(N Log N) we did.  The same mindset should be employed today when writing code, but more importantly it should be employed at a higher level when deciding the language, interface types, message formats, I/O pattern, network topology and database design.</p>
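<p>As a small illustration of that kind of swap, here is a minimal Python sketch of my own (the function names are made up for this example, not taken from any particular system).  Both functions answer the same question, whether a list contains a duplicate, but the first compares every pair and is O(N^2), while the second sorts once and then compares only neighbours, which is O(N Log N):</p>

```python
from itertools import combinations


def has_duplicates_quadratic(values):
    """O(N^2): compare every possible pair of elements."""
    return any(a == b for a, b in combinations(values, 2))


def has_duplicates_sorted(values):
    """O(N log N): sort once, then compare only adjacent elements."""
    ordered = sorted(values)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    sample = [7, 3, 9, 3, 1]
    print(has_duplicates_quadratic(sample))
    print(has_duplicates_sorted(sample))
```

<p>On a handful of items you will never notice the difference.  On a few million items the pairwise version is the kind of code that sends you shopping for a bigger box.</p>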
<p>Let me explain:</p>
<p>Consider our Layered SOA from the last post (shown again here for reference):</p>
<p><img class="aligncenter size-full wp-image-153" title="Point Of Sale" src="http://www.technologymusings.com/wp-content/uploads/2009/09/Point-Of-Sale1.JPG" alt="Point Of Sale" width="1371" height="561" /></p>
<p>This is in principle a logical architecture which we can choose to realize in a number of ways and with a range of languages, interface protocols, hardware choices, network topologies, etc.</p>
<p>So the first thing you need to consider at a high level is whether you are designing the system for throughput, latency/response time or both.  This is important, because depending on what the system needs to achieve, you should be making potentially dramatically different technology choices.  For example, if I am building an ultra high performance stock exchange system that does 5 million transactions per second and has a response time target end to end of 100 microseconds, then certain technologies are out right off the bat.  They include but are not limited to:</p>
<ul>
<li>I can&#8217;t use Java or C# as the languages themselves and the implementations of the base language libraries do not lend themselves to low latency.  I could use them to achieve the high throughput but I will not get to 100 microseconds as a rule (there are some ways to get close by using Java like C, pre-compiling, ensuring no garbage collection is used, etc but it&#8217;s a pain)</li>
<li>I can&#8217;t build it on Windows at all, because of the overhead and the slow network stack</li>
<li>I can&#8217;t use persistent queue based messaging</li>
<li>No traditional database transactions in the critical path (in fact no on disk I/O at all)</li>
<li>No XML anywhere</li>
<li>And definitely no REST or SOAP Web Service Interfaces</li>
</ul>
<p>The key is that while all of these technologies can be used in a high throughput system,  they are inherently not fast from a latency or response time perspective, so if you choose to use them for a system that needs ultra fast response times, you are dead from the word go because no amount of hardware will fix it.<br />
It is important to keep in mind the time it takes for basic operations as well, so that you can roughly gauge in advance what your system will be capable of.  Here is a list of rough performance measures for common operations.  Your mileage will vary based on specifics such as hardware, compiler, database, etc but the order of magnitude will be about right regardless.  These are approximations based on say a 2.8 GHz Xeon.  Newer Nehalems, etc will do better.  Also these are for a single thread on a single core.  Most won&#8217;t benefit from multithreading.  Also keep in mind that, to an extent, the lower the latency or response time of the system, the higher the throughput it can handle per core, as CPUs free up faster, so getting this right is important for all systems.   So here they are:</p>
<ul>
<li>Network hop from desktop to remote web server: 50-500 milliseconds</li>
<li>Persistent Message (small) per hop: 15-30 milliseconds</li>
<li>Non-Persistent Queue based messaging (small) per hop: 2-5 milliseconds</li>
<li>Database Insert (Complex): 15-30 milliseconds</li>
<li>Database Insert (Simple): 3-10 milliseconds</li>
<li>Database Select (Complex): 10+ milliseconds</li>
<li>Database Select (Simple): 500 microseconds-3 milliseconds</li>
<li>Binary Write to Traditional Disk: 2-5 milliseconds</li>
<li>Binary write to SSD: 25 microseconds</li>
<li>Screen Refresh: 10-15 milliseconds</li>
<li>Web Service Call inside a Web Server (Small Payload Java or C#): 1-5 milliseconds</li>
<li>Web Service Call using GSOAP or Systinet (C no server): 200-400 microseconds</li>
<li>Binary RPC Call (small payload Java or C#): 50-100 microseconds (local machine plus network roundtrip if remote)</li>
<li>Binary message based function call (Small, C, Infiniband): 1-2 microseconds in shared memory space, 5-10 microseconds remote through a switch</li>
</ul>
<p>Anyhow this list is by no means exhaustive.  Response times will increase with the size of the message payload, amount of XML to serialize/deserialize, complexity of the database query, etc.  What&#8217;s important to the system designer is the order of magnitude of the performance given the non-functional targets for the system.  Knowing what is possible for a given operation or technology should shape both your design and technology selection.</p>
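<p>One way to put the list to work is a back-of-envelope budget check before any code gets written.  The sketch below is in Python and only illustrative: the dictionary keys and the <code>fits_budget</code> helper are names I made up, and the values are simply the low end of the ranges above, expressed in microseconds.</p>

```python
# Best-case cost per operation in microseconds, taken from the low end of
# the ranges listed above. The key names are hypothetical labels.
BEST_CASE_US = {
    "persistent_message_hop": 15_000,
    "non_persistent_queue_hop": 2_000,
    "db_insert_simple": 3_000,
    "db_select_simple": 500,
    "disk_write": 2_000,
    "ssd_write": 25,
    "web_service_call_java_or_csharp": 1_000,
    "web_service_call_c": 200,
    "binary_rpc_java_or_csharp": 50,
    "binary_message_shared_memory": 1,
    "binary_message_via_switch": 5,
}


def fits_budget(operations, budget_us):
    """Add up the best-case cost of a critical path and compare to a budget.

    Returns (total_microseconds, True if the total is within the budget).
    """
    total = sum(BEST_CASE_US[op] for op in operations)
    return total, total <= budget_us
```

<p>For the 100 microsecond exchange target, a path of two switch hops and one SSD write comes to 35 microseconds at best and leaves some headroom, while a single C web service call followed by a simple select is already at 700 microseconds before any real work is done.  If the best case does not fit, no amount of hardware will make it fit.</p>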
<p>So armed with this,  let&#8217;s revisit the questions I got re the performance impact of the layered approach vs a non-layered one to see what happens.  Let&#8217;s assume the following:</p>
<ul>
<li>We implemented the physical architecture exactly as the Logical one is laid out with each service component on a separate machine</li>
<li>We used SOAP web services for every public function call</li>
<li>Assume 1 Gigabit Ethernet networking in the data center (worst case)</li>
</ul>
<p>So let&#8217;s look at the effects of layering on the Customer Service, which adds an extra layer in the proposed design.</p>
<p>Let&#8217;s assume we just want to retrieve a basic customer record.  In a single layer design with one database behind the service we have the following main costs:<br />
Network hop, application to Customer Service: 200-500 microseconds<br />
Find Customer web service call: 1-5 milliseconds<br />
Complex query doing a join across multiple tables for name, addresses, etc: 10 milliseconds<br />
Internal logic code execution: implementation and process dependent</p>
<p>Now let&#8217;s look at the layered costs:<br />
Network hop, application to Customer Service: 200-500 microseconds<br />
Find Customer web service call: 1-5 milliseconds<br />
Parallel network round trips: 200-500 microseconds<br />
Second layer parallel web service calls: 1-3 milliseconds<br />
Parallel simple queries: 500 microseconds-3 milliseconds<br />
Internal logic code execution: implementation and process dependent</p>
<p>And I would contend that you can do better by using a binary call for internal Component to Component calls, reducing the 1-5 milliseconds down to more like 50-100 microseconds.</p>
<p>So if you add it up,  you will see that not only did the layered approach potentially improve total end to end latency but it definitely improved throughput of the system.  You should now be scratching your head wondering how that happened.  There is at least one major assumption here and that is that each Service Component had its own independent database.  This allowed the queries in the Layered approach to happen in parallel and the queries are themselves dramatically simpler with potentially no joins, etc.  If we still have one single database,  then performance will be bottlenecked there and we won&#8217;t see as much gain.  Even then at worst, if the queries take the same 10 milliseconds, we added about 1-3 milliseconds to the total end to end time.  Depending on the amount of time spent in the implementation specific code, that&#8217;s about 10-20% slower latency worst case (and still less than the cost of a single screen refresh, so a user won&#8217;t notice it unless it&#8217;s enough to cause the total system to queue up throughput wise) but in return we got much better throughput and much more reuse.</p>
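<p>For anyone who wants to check the arithmetic, here it is as a small Python sketch.  The numbers are the ranges from the two breakdowns above, in microseconds.  The internal logic time is left out since it is implementation dependent, the parallel steps are counted once on the assumption that they fully overlap, and the variable names are mine.</p>

```python
# (low, high) cost of each step in microseconds, copied from the two
# breakdowns above. Internal logic time is excluded.
SINGLE_LAYER = [
    (200, 500),        # network hop, application to Customer Service
    (1_000, 5_000),    # Find Customer web service call
    (10_000, 10_000),  # complex join query
]
LAYERED = [
    (200, 500),      # network hop, application to Customer Service
    (1_000, 5_000),  # Find Customer web service call
    (200, 500),      # parallel network round trips (counted once)
    (1_000, 3_000),  # second layer parallel web service calls
    (500, 3_000),    # parallel simple queries
]
# Worst case from the text: one shared database, so the query still
# takes the full 10 milliseconds.
LAYERED_SHARED_DB = LAYERED[:-1] + [(10_000, 10_000)]


def total_range(steps):
    """Sum the low ends and the high ends of a list of (low, high) steps."""
    return sum(low for low, _ in steps), sum(high for _, high in steps)


if __name__ == "__main__":
    cases = [
        ("single layer", SINGLE_LAYER),
        ("layered", LAYERED),
        ("layered, shared database", LAYERED_SHARED_DB),
    ]
    for name, steps in cases:
        low, high = total_range(steps)
        print(f"{name}: {low / 1000:.1f}-{high / 1000:.1f} ms")
```

<p>With these inputs the single layer design comes to 11.2-15.5 milliseconds and the layered design to 2.9-12 milliseconds.  With one shared database the layered design comes to 12.4-19 milliseconds, which is 1.2-3.5 milliseconds more than the single layer design.</p>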
<p>Anyhow,  the point is, when designing a system and coding it,  you really should think about these types of things before you get so far down the road that you realize you have a problem only when you&#8217;re in it up to your neck and can&#8217;t easily do anything about it.</p>
<p>Feel free to fire away with the questions here or on Twitter (<a title="@TechMusings" href="http://twitter.com/techmusings">@TechMusings</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/consideration-for-the-technical-implementation-of-an-soa/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Why a Business Process Modeling (BPM) Approach to SOA Usually Fails</title>
		<link>http://www.technologymusings.com/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails/</link>
		<comments>http://www.technologymusings.com/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails/#comments</comments>
		<pubDate>Thu, 03 Sep 2009 13:16:39 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Service Oriented Architecture (SOA)]]></category>
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=143</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails/">Why a Business Process Modeling (BPM) Approach to SOA Usually Fails</a> </p><p>I find that when people take a BPM centric approach to SOA, it usually ends up not delivering the goods. </p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails/">Why a Business Process Modeling (BPM) Approach to SOA Usually Fails</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails/">Why a Business Process Modeling (BPM) Approach to SOA Usually Fails</a> </p><p>I was having a Twitter conversation with Brenda Michelson (@bmichelson) and Todd Biske (@toddbiske) about the tight coupling in people&#8217;s minds between BPM and SOA, and why I find that when people take a BPM centric approach to SOA, it usually ends up not delivering the goods.  So today&#8217;s post is about how to properly layer your SOA to include BPM while still yielding all the flexibility and reuse that is the promise of a well done SOA.</p>
<p>For this discussion I will build on the terminology defined in the post entitled &#8220;Anatomy of a Service Oriented Architecture&#8221;.  I will also use 2 simplified application examples to illustrate some of the pitfalls.</p>
<p>So here is our base example.  Imagine that you need to build  a very simplified Point of Sale System.  It needs to be able to log people in, update customer records and perform a transaction, including updating inventory.  For the sake of argument, I will propose it should look something like this (not 100% accurate but best I could do in 20 minutes).  Sorry the pictures are a bit hard to read.  I need to change templates to get something wider.</p>
<p><img class="aligncenter size-full wp-image-144" title="Point Of Sale" src="http://www.technologymusings.com/wp-content/uploads/2009/09/Point-Of-Sale.JPG" alt="Point Of Sale" width="1148" height="561" /></p>
<p>Furthermore, I will contend that people following a BPM centric approach to SOA will usually design it like this:</p>
<p><img class="aligncenter size-full wp-image-145" title="Point Of Sale-BPM" src="http://www.technologymusings.com/wp-content/uploads/2009/09/Point-Of-Sale-BPM.JPG" alt="Point Of Sale-BPM" width="869" height="461" /></p>
<p>and that too often with SOA design we see something like this (which is really bad)</p>
<p><img class="aligncenter size-full wp-image-146" title="Point Of Sale-BAD" src="http://www.technologymusings.com/wp-content/uploads/2009/09/Point-Of-Sale-BAD.JPG" alt="Point Of Sale-BAD" width="428" height="478" /></p>
<p>So let&#8217;s look at these starting from the bottom one which, sad to say, represents about 75+% of the &#8220;SOA&#8221; implementations I tend to see.</p>
<p><strong>Point of Sale &#8211; Badly Done Case:</strong><br />
This last example is what you get when people decide they need to be SOA, but frankly have no idea how to decompose a problem or do component based design.  It is also what you get when people decide SOA means you just slap a web service interface onto a monolithic legacy app so you can call it SOA to please senior management who have heard all the buzz, and decided they just have to have an SOA by next week to save cost, etc.  This approach buys the company absolutely nothing and in fact only slows down the performance of the existing system and certainly provides no flexibility and reuse at all.</p>
<p><strong>Point of Sale BPM Centric Case:</strong><br />
In the BPM case we will have identified our main processes:</p>
<ol>
<li>Customer Lookups, etc</li>
<li>Authenticating a User</li>
<li>Processing a Transaction</li>
</ol>
<p>We would then set about to design service interfaces which align to those and they would look something like this:</p>
<p><strong>1. Customer Service:</strong></p>
<p>1.1 Create Customer<br />
1.2 Update Customer<br />
1.3 Lookup Customer</p>
<p><strong>2. Authentication:</strong></p>
<p>2.1 Create User<br />
2.2 Update User<br />
2.3 Delete User<br />
2.4 Login</p>
<p><strong>3. Transaction:</strong></p>
<p>3.1 Record Transaction<br />
3.2 Void Transaction</p>
<p>Each of those service interfaces more often than not ends up being tightly coupled to a back end database which exposes stored procedures that are also tightly aligned to those process services. It&#8217;s basically a client server design behind the service interfaces, and while I drew them as 3 separate databases supporting each service,  more often than not I will see no decoupling under the covers and it will be one big database.<br />
Now, in this case we have a better design (but still not good) which shows some process orientation and some componentization. This design will offer some reuse, albeit limited.  In addition this system will suffer from some data duplication (consider Paul Michaud as User and Paul Michaud as Customer.  This design would store the records twice for Paul Michaud) which is not desirable in a good SOA.</p>
<p><strong>Point of Sale &#8211; Layered Case:</strong><br />
I will contend that this design is optimal (or at least as optimal as I could draw in 20 minutes).  It provides for the same level of BPM support,  but does not require any data duplication and allows for the maximum reuse.  I am not going to go into details on how one would come up with this design in this post but I will come back to that in another post soon.</p>
<p>To prove the point, let&#8217;s now consider the need to create a second application to do simple contact management.  We would ideally like to reuse our SOA and layer the new application on top of what we already built.  I will contend that only the layered design would offer any reusable components for this second application.  I will propose the following simple design for the Contact Management Application using SOA.</p>
<p><strong>Contact Management:</strong></p>
<p><img class="aligncenter size-full wp-image-147" title="Contact Management Simple" src="http://www.technologymusings.com/wp-content/uploads/2009/09/Contact-Management-Simple.JPG" alt="Contact Management Simple" width="500" height="450" /></p>
<p>Notice that we are using the Party service, not a Customer Service or anything else.  A contact may be a Company, Organization, Church, Personal friend, etc.  If we had only built Customer Services as in designs 2 &amp; 3, which is what the BPM approach would have identified, we would not have had a foundation on which to build this second application (unless that second application was very similar to the first one).</p>
<p>Anyhow,  the point is that by focusing on the process we usually fail to identify the necessary Foundation Services or the Fundamental Data Objects on which the SOA should be operating.  I will elaborate on this further in a later post.</p>
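<p>To make the layering idea concrete, here is a rough sketch in Python.  It is not the design from the diagrams, just an illustration under my own assumptions: the class and method names are hypothetical, and everything is held in memory.  The point is that the Party record lives in exactly one place, and both the Point of Sale layer and the Contact Management layer hold references to it rather than copies.</p>

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Party:
    party_id: int
    name: str
    kind: str  # for example "person", "company", "church"


class PartyService:
    """Hypothetical foundation service: the single owner of party records."""

    def __init__(self):
        self._parties = {}
        self._ids = count(1)

    def create(self, name, kind):
        party = Party(next(self._ids), name, kind)
        self._parties[party.party_id] = party
        return party

    def get(self, party_id):
        return self._parties[party_id]  # KeyError if the party is unknown


class CustomerService:
    """Point of Sale layer: records who is a customer, not who they are."""

    def __init__(self, parties):
        self._parties = parties
        self._customer_ids = set()

    def register(self, party_id):
        self._parties.get(party_id)  # validate that the party exists
        self._customer_ids.add(party_id)

    def lookup(self, party_id):
        if party_id not in self._customer_ids:
            raise KeyError(party_id)
        return self._parties.get(party_id)


class ContactService:
    """Contact Management layer, built on the same PartyService."""

    def __init__(self, parties):
        self._parties = parties
        self._contacts = {}  # owner party id -> set of contact party ids

    def add_contact(self, owner_id, contact_id):
        self._parties.get(owner_id)
        self._parties.get(contact_id)
        self._contacts.setdefault(owner_id, set()).add(contact_id)

    def contacts_of(self, owner_id):
        ids = sorted(self._contacts.get(owner_id, set()))
        return [self._parties.get(i) for i in ids]
```

<p>If Paul Michaud is created once in the PartyService, he can be registered as a customer and added as somebody&#8217;s contact without his name and details ever being stored a second time, and the contact application gets churches and companies for free because a Party is not assumed to be a customer.</p>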
<p>As always feel free to leave comments here or on Twitter.com (@techmusings).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/why-a-business-process-modeling-bpm-approach-to-soa-usually-fails/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>The Need For Speed</title>
		<link>http://www.technologymusings.com/the-need-for-speed/</link>
		<comments>http://www.technologymusings.com/the-need-for-speed/#comments</comments>
		<pubDate>Wed, 26 Aug 2009 19:12:58 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[High Performance Computing]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=131</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-need-for-speed/">The Need For Speed</a> </p><p>Yesterday was a busy day for me.  It started at 4:30 AM when I had to do an interview with a reporter from Bloomberg who covers the European Stock Exchanges.  There was then coverage of the goings on with some of my clients in the Wall Street Journal, Financial Times, and many other papers, which ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-need-for-speed/">The Need For Speed</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/the-need-for-speed/">The Need For Speed</a> </p><p>Yesterday was a busy day for me.  It started at 4:30 AM when I had to do an interview with a reporter from Bloomberg who covers the European Stock Exchanges.  There was then coverage of the goings on with some of my clients in the Wall Street Journal, Financial Times, and many other papers, which kept me fielding questions most of the day (and I have more reporter interviews today as well).  This interest in the exchanges spilled over onto Twitter as a result of people commenting on the coverage on CNBC.  As a result of that,  I thought I would talk a bit about the trends and challenges faced by the World&#8217;s Investment Banks, Hedge Funds and Exchanges as they grapple with their Need For Speed, as there seemed to be interest from many of my followers both here and on Twitter.</p>
<p>So what&#8217;s driving the system designs at today&#8217;s Financial Market firms?  Well, amongst other factors such as cost reduction, operational efficiency, and the other usual IT issues, there is an ever increasing need for speed.  Let me give you some background.</p>
<ul>
<li>If you&#8217;re Twitter, you handle about 6 million Tweets per day and peak out at about 200 or so Tweets per second based on what info I have seen.</li>
<li>If you were a Stock Exchange 10 years ago, you did a peak of a few thousand transactions per second</li>
<li>A big credit card processor handles a few tens of thousands of transactions per second at peak</li>
<li>Today&#8217;s Stock exchanges are building systems capable of handling millions of transactions and quotes per second</li>
</ul>
<p>Not only are the throughput requirements exploding, but the tolerated response time or latency is approaching zero at an alarming rate.  Again for comparison:</p>
<ul>
<li>A Telecom system handles a few hundred thousand calls per second but can take a few seconds until the first ring without people complaining</li>
<li>Twitter, which is considered &#8220;real time&#8221; also takes a few seconds to acknowledge and publish a Tweet (at best)</li>
<li>A credit card processor also can be a few seconds to respond</li>
<li>5 Years ago the NYSE would take about 40 milliseconds to process a trade</li>
<li>Today cutting edge exchanges are building systems which can process a trade in under 100 microseconds (yes that&#8217;s micro, not milli and definitely not seconds).</li>
</ul>
<p>On the flip side,  Banks and Hedge Funds with their algorithmic trading systems are sending trades into these exchanges at volumes which for stocks are increasing at over 50% per year and for options at well over 100% per year, so these systems need to scale.  In addition those same firms are monitoring the response times of each exchange and if they see one being slower than another, the trade gets routed to the faster exchange wherever possible.  The drive toward zero latency is causing a lot of traditional exchanges who had legacy systems, such as the New York Stock Exchange, London Stock Exchange and Deutsche Boerse (three of the biggest), to lose market share, and as a result they are all undergoing radical redesigns and deploying new cutting edge systems.</p>
<p>In addition to all of this speed,  keep in mind that the reliability levels on these systems are ultra high as well.  We design for 99.9999% uptime with absolutely zero loss of messages.  It&#8217;s that need for high reliability levels coupled with the speed that makes building these systems such a challenge.  Making things fast without being reliable is easy. Making things reliable without being fast is also easy.  Bringing them both together is very difficult and requires radical new system designs.</p>
<p>Just think of some of the challenges you face.</p>
<ul>
<li>How do you log transactions to a database when a physical hard drive takes 2 milliseconds to do a write?</li>
<li>You can&#8217;t do database transactions in the critical transaction path because any database operations kill you for throughput and latency</li>
<li>You need to be highly horizontally scalable to handle the constant growth in transaction volume</li>
<li>You need to be running redundant hot/hot configurations for failover because the system reliability target is higher than that of any single component.  Components will fail,  but the system must stay up and not lose a beat</li>
<li>After 9/11 the Disaster Recovery mechanism has to keep the system up and running even during and after a 9/11 type event with no loss of messages or down time</li>
<li>How do you record all the trades?  A trade record is typically pretty small, between 120 and 400 bytes, not much bigger than a Tweet on Twitter.  People always seem amazed that Twitter needs to store a few tens of GB per day.  Well with a modern exchange system we log about 1.2 TB per day and we need to keep 5 years of that searchable on main storage without going to tape, etc.</li>
</ul>
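<p>A few lines of arithmetic show how quickly those numbers add up.  This is a Python sketch with some assumptions of my own: decimal terabytes, 252 trading days a year, and the simplification that the whole 1.2 TB log is made up of trade-sized records.  The function names are just for illustration.</p>

```python
DAILY_LOG_BYTES = 1_200_000_000_000  # 1.2 TB per day, decimal terabytes (assumption)
TRADING_DAYS_PER_YEAR = 252          # assumption, not a figure from this post


def records_per_day(record_bytes):
    """How many fixed-size records fit in one day of log."""
    return DAILY_LOG_BYTES // record_bytes


def retained_bytes(years, days_per_year=TRADING_DAYS_PER_YEAR):
    """Total log volume that has to stay searchable on main storage."""
    return DAILY_LOG_BYTES * years * days_per_year


def max_serial_writes_per_second(write_microseconds):
    """Upper bound on one-at-a-time writes if each must finish before the next."""
    return 1_000_000 // write_microseconds
```

<p>Under those assumptions, 1.2 TB a day is somewhere between 3 billion records (at 400 bytes) and 10 billion records (at 120 bytes), five years of it is roughly 1.5 PB, and a disk that takes 2 milliseconds per write can only do 500 writes a second if you wait for each one.  That gap is why the logging cannot sit in the critical path.</p>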
<p>In these systems every microsecond counts.  As such even network path length is measurable and directly impacts trading profits for banks and hedge funds.  As a result firms will move their trading systems directly into colocation facilities offered by the exchanges to keep the latency added by the network itself to the absolute minimum.  Networks in these facilities are going from 1GigE networks to 40 Gig Infiniband and/or 10GigE (which is slower than IB for both throughput and latency).  The NYSE is putting in 100GB Fibre Optic Switches for their WAN.  They use ultra high speed messaging software on top of that network, solid state disks with SAN backups, the fastest processors with the highest I/O.  Every line of code needs to be extra tight, with Matching Engines in the exchanges typically being single threaded and using an MPP design instead of SMP because even the cost of a thread context switch is measurable and impacts profits.</p>
<p>It&#8217;s a very tough set of criteria to meet and poses some unique challenges.  I&#8217;m proud to say that currently the fastest of the new systems coming to market are ones I helped design and use technology I helped IBM develop, so I guess we&#8217;re doing something right.</p>
<p>If people have any questions on this or any of my other posts, please use the comments or reach me on Twitter at <a title="@TechMusings" href="http://twitter.com/techmusings">twitter.com/techmusings</a> (@techmusings)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/the-need-for-speed/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How To Build an SOA Based, High Performance, Scalable and Reliable Twitter on Steroids</title>
		<link>http://www.technologymusings.com/how-to-build-an-soa-based-high-performance-scalable-and-reliable-twitter-on-steroids/</link>
		<comments>http://www.technologymusings.com/how-to-build-an-soa-based-high-performance-scalable-and-reliable-twitter-on-steroids/#comments</comments>
		<pubDate>Thu, 20 Aug 2009 17:30:07 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[High Availability (HA)]]></category>
		<category><![CDATA[Service Oriented Architecture (SOA)]]></category>
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=109</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/how-to-build-an-soa-based-high-performance-scalable-and-reliable-twitter-on-steroids/">How To Build an SOA Based, High Performance, Scalable and Reliable Twitter on Steroids</a> </p><p>Over the past few days I have been having some issues with my Twitter account.  Beyond the well known pauses in the service, outages, etc there are some less known but more annoying problems with twitter search.  It turns out that many accounts don&#8217;t show up in search at all.  Therefore, if you are one ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/how-to-build-an-soa-based-high-performance-scalable-and-reliable-twitter-on-steroids/">How To Build an SOA Based, High Performance, Scalable and Reliable Twitter on Steroids</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/how-to-build-an-soa-based-high-performance-scalable-and-reliable-twitter-on-steroids/">How To Build an SOA Based, High Performance, Scalable and Reliable Twitter on Steroids</a> </p><p>Over the past few days I have been having some issues with my Twitter account.  Beyond the well known pauses in the service, outages, etc there are some less known but more annoying problems with Twitter search.  It turns out that many accounts don&#8217;t show up in search at all.  Therefore, if you are one of those lucky accounts, no one other than direct followers can see your tweets and no one can find you or any of your Tweets.  This makes the accounts pretty useless.  It also turns out it&#8217;s been a known issue with no fix for over a year other than to create a new account and tweet with that.  Well it turns out that my account was one such account, which needless to say was very annoying and cost me 2 days of my time trying to figure out a viable workaround.  As a result, Twitter earned the place of honor in today&#8217;s blog.</p>
<p>Now in the defense of Evan Williams, Biz Stone and the rest of the gang at Twitter, they find themselves in the enviable position of having a hugely successful product on their hands which has no doubt outpaced their wildest growth projections over the past few years and thus put stress on their design and everything else.  I on the other hand have the advantage of 20/20 hindsight and thus in this blog we can design Twitter on Steroids from scratch using technology that was not even available when Twitter was conceived.  I know the team at Twitter is busting their butts to keep up with their phenomenal growth and my hat&#8217;s off to them for their success.</p>
<p>So for those who have not read my Bio, I have been designing and building ultra high performance systems for the World&#8217;s Largest Banks and Stock Exchanges for about 25 years.  Just this June a couple of colleagues of mine and I designed and ran a stock exchange prototype system capable of 4.5 million transactions per second with round trip response time as low as 15 microseconds (yes that&#8217;s microseconds for multiple network hops, I/O, parsing, matching and the whole shebang, everything the NYSE does to tell you that you just bought 100 shares of IBM).  We also showed this system can scale linearly for throughput by adding hardware, was fully fault tolerant and could do dynamic load balancing if traffic at the exchange spiked.  In this design, I will be leveraging the lessons learned over those 25 years and the technologies used for the system above.</p>
<p>So let&#8217;s dive in.</p>
<p><strong>Requirements:</strong><br />
So what does our Twitter on Steroids need to do?  Here is my overly simplistic list of requirements (I am only going to deal with the big ones):</p>
<p><strong>Functional Requirements:</strong><br />
The system shall allow users to create accounts<br />
The system must provide a means for users to submit Tweets<br />
The system must persist those Tweets<br />
Users shall be able to follow other users&#8217; Tweets<br />
The system shall provide a mechanism to search Tweets</p>
<p><strong>Non-Functional Requirements:</strong><br />
The system shall be highly responsive<br />
The system shall maintain response times even under load<br />
The system shall be highly scalable<br />
The system shall be highly available with 99.999% or better uptime (it&#8217;s doable)</p>
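<p>To put the uptime target in perspective, here is a quick Python sketch.  The helper names are mine, the year is taken as 365 days, and the second function assumes redundant copies fail independently of each other, which is an idealisation.</p>

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def allowed_downtime_seconds(availability_percent):
    """Seconds per year a system may be down and still meet the target."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


def redundant_availability_percent(component_percent, copies):
    """Availability of N redundant copies, assuming independent failures."""
    unavailable = (1 - component_percent / 100) ** copies
    return 100 * (1 - unavailable)
```

<p>At 99.999% the whole year&#8217;s allowance is about 315 seconds, a little over five minutes, and at 99.9999% it is about 32 seconds.  The second function shows why redundancy is the usual route there: two independent copies that are each only 99.9% available come out at about 99.9999% together.</p>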
<p><strong>Where Do We Start?</strong></p>
<p>First some design principles:</p>
<ul>
<li>We will use a componentized SOA design</li>
<li>The Twitter Web Site will use the same Service API that is exposed publicly</li>
<li>The System will use a Hot/Hot High Availability Model based on component replication for reliability</li>
<li>All Service Components will be implemented in a manner that ensures deterministic behaviour (the easiest way to do that, but not the only way, is to make each one single threaded, which is what we do for most exchange systems.  Thread context switches are expensive at speed, and multithreading can result in coherency issues, which Twitter seems to be suffering from based on the comments on their support site)</li>
<li>To the maximum extent possible all I/O, remote Service calls, etc. will be asynchronous</li>
<li>All internal communication will be message based using multicast for efficiency</li>
</ul>
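<p>To make the determinism principle concrete, here is a minimal sketch of why it matters for Hot/Hot replication: if a component is single threaded and its state depends only on the order of its input messages, then two instances fed the same ordered stream end up in exactly the same state.  The class and function names below (TweetCaptureReplica, replay) are hypothetical and purely for illustration; this is not code from any real system.</p>

```python
import hashlib
import json


class TweetCaptureReplica:
    """Single-threaded, deterministic state machine (hypothetical component)."""

    def __init__(self):
        self.tweets_by_user = {}
        self.applied = 0

    def apply(self, message):
        # No clocks, random numbers or threads: state depends only on input order.
        user, text = message["user"], message["text"]
        self.tweets_by_user.setdefault(user, []).append(text)
        self.applied += 1

    def digest(self):
        # Canonical JSON so that equal states always hash to the same value.
        payload = json.dumps(self.tweets_by_user, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def replay(messages):
    """Feed an ordered message stream into a fresh replica."""
    replica = TweetCaptureReplica()
    for message in messages:
        replica.apply(message)
    return replica


stream = [
    {"user": "alice", "text": "hello"},
    {"user": "bob", "text": "hi alice"},
    {"user": "alice", "text": "hi bob"},
]
primary = replay(stream)
secondary = replay(stream)
print(primary.digest() == secondary.digest())
```

<p>Comparing digests like this is also a cheap way to check that a primary and its backup have not drifted apart.  Feed the same messages in a different order and the digests differ, which is why total ordering of messages across receivers matters so much in this design.</p>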
<p><strong>About the Technology</strong><br />
I don&#8217;t normally like to reference specific technologies in my blog but in this case I am going to as there are a couple which provide unique capabilities to implement this system design, and which people are probably not familiar with.  Apologies in advance for the product plug.  They are as follows:</p>
<p><strong>WebSphere MQ Low Latency Messaging (LLM):</strong><br />
LLM is a high performance messaging product with capabilities purpose built for ultra high throughput, low latency, transactional systems.  For one, it&#8217;s the fastest messaging available on the market, capable of throughput in excess of 9 million Tweets per second per connection, and application to application latency across a switch as low as 3 microseconds with Infiniband networking and about 12 microseconds with 10GbE.<br />
More important than its speed for this type of application, though, are its high availability mechanisms.  LLM allows me to deliver messages to a primary and a secondary Service Component at speed, while maintaining total order across all receivers.  In addition, it provides mechanisms to perform failure detection, failover, state synchronization and component replication, all at speed.  In exchange systems, LLM has detected a failure and failed over from a primary system to a backup in as little as 7 milliseconds, with no loss or duplication of messages and no system level downtime even though a component failed.</p>
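<p>The idea of failover with no loss or duplication can be sketched in a few lines.  This is only a toy model under my own assumptions (sequence-numbered messages, a secondary that shadows every message, a crash that happens cleanly between messages); it does not use or represent the LLM API, and the names Replica and run_with_failover are made up for the example.</p>

```python
class Replica:
    """Toy replica that applies messages strictly in sequence order."""

    def __init__(self, name):
        self.name = name
        self.last_seq = 0   # highest sequence number applied so far
        self.output = []    # results this replica has published

    def deliver(self, seq, payload, is_primary):
        # Total order: every replica sees the same sequence numbers in the same order.
        if seq != self.last_seq + 1:
            raise ValueError("gap or duplicate at sequence %d" % seq)
        self.last_seq = seq
        if is_primary:
            self.output.append((seq, payload))  # only the acting primary publishes


def run_with_failover(messages, fail_after):
    primary, secondary = Replica("instance-1"), Replica("instance-2")
    primary_alive = True
    for seq, payload in messages:
        if primary_alive:
            primary.deliver(seq, payload, is_primary=True)
        # The secondary shadows every message so its state stays in step.
        secondary.deliver(seq, payload, is_primary=not primary_alive)
        if primary_alive and seq == fail_after:
            primary_alive = False  # simulated crash; the secondary is promoted
    return primary.output + secondary.output


messages = list(enumerate("abcdef", start=1))
published = run_with_failover(messages, fail_after=3)
print([seq for seq, _ in published])  # [1, 2, 3, 4, 5, 6]
```

<p>Because the secondary has already applied every message up to the point of failure, it simply starts publishing at the next sequence number.  A real system also has to deal with a primary that dies part way through a message, which this sketch deliberately ignores.</p>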
<p><strong>DataPower XM70:</strong><br />
This is an appliance that was originally designed for Web Service and Web Edge Security.  This model is specifically enabled to work with LLM above.  It will allow us to expose REST or SOAP based services and convert them to message based for internal consumption.  The XM70 can also do content based routing, parsing and transformation for us on the fly at wire speed, taking load off the back end Service Components.</p>
<p><strong>XIV Storage:</strong><br />
This is a low cost storage appliance that has great throughput and reliability.  I have been able to sustain write speeds with this in excess of 5.5 Gb per second per Intel box writing to it.</p>
<p>For the rest we can use pretty much commodity stuff.  The disk above can also be easily swapped for your preferred flavour; this one just has great price performance.</p>
<p><strong>What Does Twitter on Steroids Look Like?</strong></p>
<p>My version of Twitter on Steroids would look like this (except I didn&#8217;t have room on the drawing to add the Account Management Service Components or the Follower Service Components, so just imagine they are in the diagram and follow the same pattern <img src='http://www.technologymusings.com/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' />  ):</p>
<div id="attachment_111" class="wp-caption aligncenter" style="width: 631px"><img class="size-full wp-image-111" title="Twitter On Steroids" src="http://www.technologymusings.com/wp-content/uploads/2009/08/TwitterOnSteroids.jpg" alt="Twitter on Steroids" width="621" height="839" />
<p class="wp-caption-text">Twitter on Steroids</p>
</div>
<p>So let&#8217;s walk through this diagram.</p>
<ol>
<li>Firstly we are using BIG-IP to load balance across the Web Servers and also across the DataPower appliances.  This is pretty standard Web design, no surprises.  The BIG-IP could also do this to a remote backup site if configured correctly, where we could twin this setup for failover or load balancing.  Or we could put the Instance 2s in the second site.  It just depends on the SLAs you are trying to meet.  The logical design and coding would not change regardless.</li>
<li>The Web Servers are making calls through the DataPower to the back end Service Components just like the external API calls.  This ensures consistent behaviour and reduces the need to test and maintain two APIs.</li>
<li>DataPower is converting all REST and SOAP payloads into messages on top of LLM.</li>
<li>This is important.  DataPower is multicasting all messages out of the appliance using LLM&#8217;s high availability mechanisms.  It is also putting those messages on different topics based on the content of the message.  I am suggesting partitioning the incoming Tweets based on the first few letters of the Tweeter&#8217;s ID.  The first 2 letters will do to start, giving us 26 x 26 = 676 topics to work with for load balancing.  We can add more topics for finer partitioning later if need be.</li>
<li>LLM is delivering the messages throughout the system and also providing all the reliability.  It handles NAKs and ACKs automatically, retransmissions, etc. to ensure messages get where they need to be without any additional work by the application.</li>
<li>Tweets are first picked up by the Tweet Capture Service Components.  Each partition subscribes to and handles a subset of the topics in order to provide load balancing.  It is possible to add an external system which monitors load per topic and dynamically changes the subscriptions to adjust load.  Also by partitioning, we can use multiple databases in parallel thus eliminating the databases as a bottleneck, throughput wise.</li>
<li>I/O in the Tweet Capture Service Components is asynchronous, providing very fast response times.  We can batch write the tweets for higher throughput, and because we do component replication using LLM, if the primary Instance 1 fails, Instance 2 just takes over where it left off with no loss of messages or duplication.</li>
<li>The Tweet Capture rebroadcasts (multicast) all messages to the Tweet Indexing Service Component.  These are also twinned for High Availability and partitioned for scalability.  The indexing component does as the name says: it indexes the tweets and stores a record in a database.  I would recommend an in-memory database be used with a traditional database behind it, with bi-directional synchronization of current data between the two.  SolidDB/DB2 is one pair, and TimesTen/Oracle is possibly another (but the latter pair is slower).  I/O would be batched and asynchronous again for speed.</li>
<li>When a search request comes in, it would be routed by DataPower to the Search Service Components, which would then query the Indexing Service and receive back the matching records for each keyword in the search.  A fast parallel algorithm would then be used to handle any &#8220;or&#8221; or &#8220;and&#8221; statements in the search.</li>
<li>These results would be returned to the caller via the DataPower box as a response to the original service call.</li>
</ol>
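<p>The topic partitioning described in the walkthrough can be sketched as follows.  The topic naming scheme (tweets.xx), the fallback for non-letter characters and the round-robin assignment are all my own assumptions for the example; in the real design DataPower would do the content based routing and an external monitor could rebalance subscriptions dynamically.</p>

```python
import string
from itertools import product

LETTERS = string.ascii_lowercase

# One topic per two-letter prefix: 26 * 26 = 676 topics.
TOPICS = ["tweets." + a + b for a, b in product(LETTERS, repeat=2)]


def topic_for(user_id):
    """Map a user ID to its topic; non-letters fall back to 'a' (an assumption)."""
    prefix = [c if c in LETTERS else "a" for c in user_id.lower()[:2]]
    while len(prefix) != 2:
        prefix.append("a")  # pad IDs shorter than two characters
    return "tweets." + "".join(prefix)


def assign_topics(topics, partition_count):
    """Spread topics round-robin over partitions (a static stand-in for rebalancing)."""
    assignment = {p: [] for p in range(partition_count)}
    for index, topic in enumerate(topics):
        assignment[index % partition_count].append(topic)
    return assignment


assignment = assign_topics(TOPICS, 4)
print(len(TOPICS))                            # 676
print(topic_for("techmusings"))               # tweets.te
print([len(t) for t in assignment.values()])  # [169, 169, 169, 169]
```

<p>With 676 topics there is plenty of room to move individual topics between partitions when one prefix turns out to be much busier than the others, which a simple round-robin split does not account for.</p>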
<p><strong>So how fast would this be and how big could it scale?</strong></p>
<p>Well this is just a guess based on my experience and without ever having looked at any specific search algorithms that might be used by Twitter.  Let&#8217;s assume we write everything in C behind the DataPower for speed and stability and that we use 1GbE for networking, which is the slowest at about 27 microseconds per hop.  All latencies are round trip to and from the DataPower box.</p>
<ol>
<li>I think for Tweet Capture, we could achieve round trip latency per tweet of about 50-60 microseconds with throughput per partition somewhere in the 100,000-200,000 Tweets per second range if using a fast database and some solid state disk for the database log files, etc.  Even higher if a custom binary file system is used (15,000,000+ Tweets per second, which I have done with stock orders with similar sized messages).</li>
<li>Per partition performance similar to that of the Capture is possible for the Tweet Indexing.</li>
<li>For Tweet Search it is a bit tougher to gauge, but I would guess it would be about 100-150 microseconds per search depending on the algorithm used.  Throughput should also be well into the hundreds of thousands per second if in-memory databases are used.</li>
<li>Response times could be reduced by as much as 25 microseconds per network hop by using Infiniband networking instead of 1GbE.</li>
<li>From a scaling perspective, this should be able to scale linearly by adding hardware almost without limit (only limited by the available network bandwidth).</li>
</ol>
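<p>As a rough back-of-the-envelope check on these numbers, the arithmetic can be written down directly.  The per-hop figures come from the estimates above; the hop count of 2 (one hop in, one hop out) and the partition count of 10 are assumptions I picked purely to illustrate the calculation.</p>

```python
GBE_HOP_US = 27            # per-hop latency on 1GbE, from the text
INFINIBAND_SAVING_US = 25  # best-case saving per hop with Infiniband, from the text


def network_time_us(hops, per_hop_us):
    """Total time spent on the wire for a request, in microseconds."""
    return hops * per_hop_us


def aggregate_throughput(partitions, per_partition_tps):
    """Linear scaling assumption: partitions do not share a bottleneck."""
    return partitions * per_partition_tps


hops = 2  # assumption: one hop in, one hop out
print(network_time_us(hops, GBE_HOP_US))                         # 54
print(network_time_us(hops, GBE_HOP_US - INFINIBAND_SAVING_US))  # 4
print(aggregate_throughput(10, 100_000))                         # 1000000
```

<p>Under those assumptions the network accounts for most of the round trip on 1GbE, which is why the choice of interconnect makes such a visible difference, and ten partitions at the low end of the per-partition estimate already gives a million Tweets per second.</p>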
<p>Now clearly, this is a simplified case and I am sure there are lots of design details we are missing, but I think you get the idea.  A bigger, badder Twitter (or any other app for that matter) is definitely possible, and by using the SOA pattern, async I/O, component replication, etc. we can do this to almost anything.  So if anyone from Twitter (or anyone else for that matter) wants to talk specifics or other examples, feel free to leave a comment or reach me on Twitter (@techmusings) any time.</p>
<p>Sorry for picking on Twitter; they just seemed like a good example given my struggles.  We all wish we had the &#8220;problems&#8221; that come with such a huge success.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/how-to-build-an-soa-based-high-performance-scalable-and-reliable-twitter-on-steroids/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>What&#8217;s in a Cloud (or Not)</title>
		<link>http://www.technologymusings.com/whats-in-a-cloud-or-not/</link>
		<comments>http://www.technologymusings.com/whats-in-a-cloud-or-not/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 20:17:36 +0000</pubDate>
		<dc:creator>Paul Michaud</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Service Oriented Architecture (SOA)]]></category>
		<category><![CDATA[Software Design]]></category>

		<guid isPermaLink="false">http://www.technologymusings.com/?p=106</guid>
		<description><![CDATA[<p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/whats-in-a-cloud-or-not/">What&#8217;s in a Cloud (or Not)</a> </p><p>I read a lot of articles on technology and it always amazes me the degree of heated debate that goes on in the blogosphere, social media and elsewhere over simple definitions.  What caught my attention today was the number of posts and comments on Twitter about what was or was not Cloud. So the question ...</p></p><p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/whats-in-a-cloud-or-not/">What&#8217;s in a Cloud (or Not)</a> </p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.technologymusings.com">Technology Musings - Thoughts about Technology and Startup&#039;s</a> <a href="http://www.technologymusings.com/whats-in-a-cloud-or-not/">What&#8217;s in a Cloud (or Not)</a> </p><p>I read a lot of articles on technology and it always amazes me the degree of heated debate that goes on in the blogosphere, social media and elsewhere over simple definitions.  What caught my attention today was the number of posts and comments on Twitter about what was or was not Cloud.</p>
<p><strong>So the question is: What is Cloud? </strong></p>
<p>The reality is there is no agreement on this point, so I offer up my own view on this matter for debate.  Feel free to flame away.</p>
<p><strong>The Paul Michaud Definition of Cloud Computing</strong><br />
Any application which can be deployed and scaled (preferably dynamically) against a distributed (potentially globally distributed) cluster of homogeneous or heterogeneous compute resources is a Cloud based application.</p>
<p>So what&#8217;s my point?  The point is that almost anything is potentially Cloud based by that definition.  Let&#8217;s look at some examples that were being tossed about today on Twitter and the Blogosphere.</p>
<p>They were:</p>
<ul>
<li>JPMC&#8217;s internal server cluster</li>
<li>Google&#8217;s Cluster</li>
<li>Facebook&#8217;s Clusters</li>
</ul>
<p>James Watters in his &#8220;<a title="Not So Fast Public Cloud: Big Players Still Run Privately" href="http://siliconangle.com/ver2/2009/08/17/not-so-fast-public-cloud-big-players-still-run-privately/">Not So Fast Public Cloud: Big Players Still Run Privately</a>&#8221; contends that JPMC&#8217;s cluster of servers represents an internal Cloud.  James then took some heat from others claiming that a dedicated internal cluster is not Cloud.  The argument then extended to bring in Google, with the claim that it is also a dedicated internal cluster and not Cloud, but that Facebook&#8217;s cluster is a Cloud because they openly admitted to using Hadoop to some extent.</p>
<p>For the record, I think this whole Internal Cluster/External Cloud debate is nonsense.  To be honest, all of the systems listed above are Cloud in my opinion.  All of them allow for dynamic deployment of processing load against a distributed cluster of compute resources.  From the perspective of the company owning the cluster, it&#8217;s an Internal Cloud.  Once they open it up by providing a public interface into those resources, then it&#8217;s a public Cloud resource from the standpoint of an external user of those resources.</p>
<p>Cloud is not the sole property of our latest Web 2.0 startups.  It&#8217;s not a function of some particular piece of software that we collectively decide is &#8220;Cloud&#8221; like Hadoop.  Cloud is a design pattern and a business choice to allow us to take advantage of vast compute resources of all kinds in a more dynamic, efficient and cost effective manner, period.  Furthermore, to effectively use Cloud resources I think you ideally need to be SOA.</p>
<p>Let the Flaming begin.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.technologymusings.com/whats-in-a-cloud-or-not/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

