Change of Venue
Thanks very much for visiting. I am changing the location for this blog to affinitymask.wordpress.com, and all future posts after September 1, 2010, will be at this new location. Hope to see you there.
One major consideration for a successful SharePoint deployment is knowing how much to request in the way of resources. In some cases this is done by guess and by golly. However, there are some effective principles to use in planning and testing capacity, and in this rich, deep presentation, Steve Peschke covers them. This is an exceptionally valuable session for anyone looking to make their SharePoint deployment a success. It’s the best kind of presentation: meat and potatoes all the way through.
What’s the goal of your testing?
Most important for most of us is RPS (Requests per Second). We might also be interested in a specific operation, in terms of the way it affects the farm or the way the farm affects it. Perhaps even just a single page, in terms of its TTLB (time to last byte). Usually capacity planning testing means verifying that an existing approach/plan is viable, or it means proving a concept.
Once we know what we want to do, we know what we need to do: what we want to measure. Most tests are based on RPS, since so many things are based on it. However TTLB is also crucial, and many tests might include both. Peschke gives the example of a farm that needs to satisfy 100 RPS, with pages loading within 5 seconds.
Also relevant: crawl time, meaning how long a crawl takes and how much material there is to index. Document indexing rate is more complicated.
Determining Throughput Requirements, or RPS
Can be complicated. It must reflect not necessarily what can theoretically be done, but what your farm’s customers need. Here he makes one of his most important points: the number of users means nothing. So, naturally enough, ascertaining this is imperative, and Peschke offers several ways:
- Historical data, from IIS logs and Log Parser, Web Trends, etc.
- Start with the number of users, divide them into profiles, multiply the number of users by the number of operations per user profile, and base your RPS on the peak concurrency of this.
Peschke gives an example. It’s such a great example of his approach and the merits of this presentation that I reproduce it below.
1. Contoso has 80,000 users; during any 8 hours, up to 40,000 may be at work. So we have 80,000 users, 40,000 active, and concurrency of 5% to 10%. Concurrency means active at the same time, and can be estimated.
2. Of these, 10% are light, 70% are typical, 15% are heavy, and 5% are extreme. This is a best guess.
3. Let’s say light users do 20 RPH (requests per hour). Collectively, this means 80,000 RPH.
4. Let’s say typical users do 36 RPH (requests per hour). Collectively, this means 1,008,000 RPH.
5. Let’s say heavy users do 60 RPH (requests per hour). Collectively, this means 360,000 RPH.
6. Let’s say extreme users do 120 RPH (requests per hour). Collectively, this means 240,000 RPH.
7. This means 1,688,000 RPH, or 469 RPS for these 40,000 users.
8. When we factor in peak concurrency (10%), this comes to 46.9 RPS. That’s the target we need for this farm.
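For readers who want to adapt the arithmetic to their own numbers, here is a minimal sketch of the calculation above (mine, not Peschke’s); the profile percentages and requests-per-hour figures are the estimates from the Contoso example, not measured values.

```python
# Sketch of the Contoso throughput arithmetic; all inputs are the
# estimates from the example above, not measured values.
active_users = 40_000
profiles = {           # share of active users, requests per hour
    "light":   (0.10, 20),
    "typical": (0.70, 36),
    "heavy":   (0.15, 60),
    "extreme": (0.05, 120),
}

total_rph = sum(active_users * share * rph for share, rph in profiles.values())
rps_if_all_active = total_rph / 3600
peak_concurrency = 0.10

print(f"{total_rph:,.0f} RPH")                                   # 1,688,000 RPH
print(f"{rps_if_all_active:.0f} RPS if all were active")         # 469 RPS
print(f"{rps_if_all_active * peak_concurrency:.1f} RPS target")  # 46.9 RPS
```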
Excellent, logical analysis. Now that we have some idea of throughput, we need to consider what test mixes to prepare. What activities should these test mixes reflect? Historical information, if available, helps with this; otherwise Peschke refers us to test mixes in “Planning for Software Boundary” documents on TechNet. Yet inevitably some educated guesses will be necessary.
Once we know what our throughput is, and we know what sort of transactional mix is needed, we can design a test environment. Peschke notes that few people invest the necessary time in this; he recommends two months, including three weeks in a test lab.
The test environment needs to reflect crucial infrastructure factors, including Active Directory: how many forests and domains, and how many user accounts and groups are needed? Peschke mentioned he aims for one DC for every 3-4 web front ends. How will load balancing be implemented for this?
Hardware must also be considered. A Visual Studio Team Test controller will be needed, along with several VSTT agents and a separate SQL server (so the test rig does not impact the MOSS SQL server, which is, after all, one of the main choke points). He also recommended turning off anti-virus software on the load test controller and its agents.
Certain configuration changes should also be made, such as stopping the timer and admin services, as well as profile imports and crawls. All pages should be published, and none should be checked out. A wide variety of pages should be included. He also points out that each Write scenario will change the content database, and this database should be restored from backup after a test run which includes writes; this assures a consistent, uniform baseline. Again, he recommends stopping anti-virus software unless you want to measure performance when using the SharePoint-integrated anti-virus capability.
Other, account-related tasks to consider include how many users and in what roles, how these will be populated, what audiences/profile imports/search content sources will be needed (if any), and whether crawl content or profiles will need to be imported for testing (he gave the example of one crawl and profile import which took three days to complete).
More on Test Design
Sample data is a major stumbling block for many implementations. The sample content should be varied, not merely numerous iterations of a single document; otherwise the search query testing results will be ridiculous. Using a backup of an existing farm is the best option for sample data. Peschke recommended the tools at www.codeplex.com/sptdatapop for populating test environment data. Even so, he noted that in his experience you will almost always have to write some of your own tools for this.
Web Tests Best Practices. A mixed bag of good insights. The system behaves differently for different roles, so a variety of such roles should figure in testing; do not simply use farm admin for everything. Test with model client apps such as RSS readers, Outlook, etc. In addition, don’t neglect time-based operations, such as Outlook syncs. Validation rules should also be used. Test the web test itself: does it work for all your URLs, and does it work for all users? He also recommended setting the Parse Dependent Requests property to false.
Load Test Best Practices. Make sure your planned test reflects a good mix of tasks. Restore from a backup before each test run, remembering to defrag the indices. Use iisreset. Remember a warm-up period, since the first test after a restore will invoke many things which are not characteristic of regular operations. He briefly discussed “think times”, mainly to say that such user behavior is almost impossible to accurately model.
Demo of a Test. Peschke next crossed his fingers and did a demo to add a web test and run a load test. What struck me most was that this testing was all done from within Visual Studio, in a very straightforward manner. Even so, it will take a few viewings, preferably with the software running on your own machine, for this to all make sense.
Questions to Consider. Good wisdom here about testing. Always assume there is a bottleneck, and ask yourself where this might be, and how it might be alleviated. And ask if the bottleneck is in an unexpected location. Is the throughput spiky? Are there many errors displaying? And be as concerned about tests which are extremely good as about tests which are extremely bad.
Post-Testing Investigation
Investigation Techniques: There was more here than the standard troubleshooting fare, and it reflected real, painful experience. Make sure the configuration is correct for hardware, the load balancer, SP settings, etc. Try running just portions of the test to divide and conquer. Simplify the farm topology. Isolate the workloads/operations/pages. Use Read scenarios rather than read/write ones. Avoid taking things for granted. This also helps illustrate why more time will be needed for testing than you might expect.
Investigating with Visual Studio Team Tests. Peschke presented a demo of this. He gave examples of table and graph views for seeking RPS and TTLB patterns and analyzing perf counters. He also gave recommendations for perf counters to focus on; this was fairly standard for machines (proc, memory, disk IO, network) and counters from VSTT default sets, adding SharePoint, Search, and Excel specific ones. One good example of this showed WFE CPU dropping while SQL CPU surged and the SQL Lock Wait Time went up, meaning that the bottleneck in this case was SQL. Peschke gave some good examples of reading test results and root-causing them.
Scaling Points. A loooong list of things to consider when scaling once capacity is understood. Too many to simply list here; I’ll include some highlights. Put data files and log files on separate spindles. Does custom code need to be revisited? For the various databases, should additional data files be added, up to one such file per proc core, spread across multiple drives? SQL disk design is critical for high-user or high-write implementations. Perhaps have separate farms: one just for My Sites, one just for publishing, one just for search, etc. Be sure to run on x64. In virtual environments, are VMs optimally configured? Are you monitoring the object cache and sizing it accordingly? Does your farm have two VLANs, one for page traffic and one for a backend channel for interserver communication and SQL? Is there a dedicated WFE for indexing? There were more, but these seemed the most germane.
Conclusion: Exceptional granularity and depth, goes far beyond mere scenarios or demos. Worth absorbing, and worth coming back to in the future, regardless of which version of SharePoint you deploy. This is why we go to Tech-Ed.
This presentation is a good intro to the major differences between Hyper-V and VMware. That being said, the materials were more polished than the presenter. Microsoft has a long history of annihilating competitors, and can enter almost any market and immediately become a dominant player. Yet in this presentation, Jason Fulenchek seemed quite tentative, and reluctant to go toe to toe with VMware. This is quite different from some presentations last year and much less go-for-the-throat than some others. In fact, I was sitting next to a VMware rep who was texting jibes back and forth with the other VMware folks in the audience. Microsoft’s case against VMware can be very strong, but you could not see it very clearly in this presentation. Parenthetically, it was made more technically in Armstrong’s “Hyper-V and Dynamic Memory” presentation.
This is a 200-level presentation, so the materials were quite polished, and seemed based on ones from technical marketing or sales engineering. The price for this polish was that much of the time was spent at a very high level.
The strategy seems to have been to emphasize the breadth of Microsoft’s platform environments, naturally looking to contrast this with VMware’s much more confined view of platforms.
Part One: Details, Details
Main benefits: familiarity, cost, ease of implementation, and consistency. Also important is the system monitoring capability for VMs: integrated, offering intra-VM monitoring and optimization, with similar monitoring capabilities for applications and services as well. I would add the large community ecosystem around MS products in general, and the superior documentation.
Unified Management. Much time went into this. The main point was that users care about applications, not VMs, an emphasis said to be better served by Microsoft’s approach. Most interesting here was the availability of “in-guest management and monitoring”, allowing you to integrate virtualization into existing processes, reducing start-up and operating costs. Virtual Machine Manager 2008 R2 is completely integrated into System Center Operations Manager 2007 R2. This allows VMs to be monitored and managed just like other machines, and also allows superior monitoring of applications, processes, and services.
At this point he launched into a detailed, list-based comparison of vSphere (various editions) and Hyper-V in terms of features and costs. This was interesting in principle, and worth reviewing at one’s leisure, but unsuitable for a presentation. This is not to say that there’s no value here; it’s just that you have to look and listen pretty hard to get a clean handle on what the key differences are. The presentation, for an IT pro crowd, would have made more sense if it had focused with more depth on a few key themes.
Part Two: Roger Johnson from Crutchfield Corp.
This began with a video explaining the benefits of MS virtualization for Crutchfield Corporation, a consumer electronics retailer. They had started out with a VMware deployment, but grew displeased with the perceived high cost and reconsidered. Eventually they switched to Hyper-V and related management technologies, saving about half a million dollars by doing so. Their corporate datacenter now runs 77% virtualized (not bad), with 5 virtualization hosts and an average VM density of 45:1, including all dev and test environments. Johnson described substantial savings versus VMware, on the scale of 3:1.
Part Three: Duking It Out
To the extent the gloves come off, it does not happen until 50 minutes into the presentation. In a slide titled “Responding to FUD”, Fulenchek invests ten (10) minutes in busting some myths.
Fulenchek followed this with some demos, which were reasonably successful and showed the impressive resources available for managing VMs, in particular (for me) the ability to manage apps running on particular servers from within System Center Operations Manager 2007. This can be contrasted with the “black-box” (as another MS presenter described it) approach used by VMware in VM management. Other features he touched on seemed less revolutionary: “storage migration”, which parallels storage VMotion, and volume re-sizes which can be done without needing storage VMotion at all.
So there is definitely a case to be made that Microsoft’s approach to virtualization is quite competitive with VMware, not least because anyone with Server 2008 already has everything needed to try this out. However despite the strength of this case, the presentation lacked force. More focus on these key points of comparison would have made this more effective. So would having done more with the Crutchfield material, which was presented and then hardly utilized at all. Oddly, the strongest case for Microsoft against VMware was made in other presentations.
Broadly speaking, there are two types of presenters at Tech-Ed. There are the meticulously outlined and powerpointed, clearly rehearsed presentations. Then there are the seemingly wing-it presentations, where the presenter has few slides, and often laughs it off, preferring to cram demos into the session. The former goes by practice, the latter by passion. Mark Minasi or Jeff Woolsey exemplify the former, and some of Steve Riley’s later presentations represent the latter.
The difficult thing is, six months after the presentation, the video and the powerpoint are usually all we have to rely on. And if all you have is an 80-minute presentation, it’s unnecessarily difficult to find what you need for your work, and you often have to go almost frame by frame to see the details. Maybe you enjoy slow-motion spelunking. Maybe I don’t.
Laura Chappell’s presentation, ostensibly on corporate espionage, split into two halves. The first was narrative, the second technical. Throughout, she tossed in insights, names, sites, and suggestions.
The first part consisted of five case studies. The first one was extremely illuminating; all were worth hearing:
1. A company had outsourced some development work to
2. A company planned to release a new cell phone product, expecting it to be a cash cow for the next few years. They decided to outsource some production to
3. Outplacement/Separation. Some types of firewalls are verbose, and some are silent. Verbose firewalls in effect inform people their access has been blocked, which enables some to find circumventions. Laura gave the example of a company which discovered almost $200,000 had been siphoned off through a remote office which had been closed (management was not aware of this). The most successful such attacks take place before major holidays.
4. Lost Prototype. An iPhone 4 prototype was lost in a public place, retrieved, and sold. Familiar story, with good discussion of the efforts made to carry out and thwart such attempts to steal products.
5. Blabla and Stephen Watt. Involved the theft of 170 million credit/debit card numbers. Watt designed a program that by itself accessed at least 45 million credit and debit card numbers. Other partners were also bought off and brought in.
There was some good insight here. For instance, she covered business aspects of breaches, from fireproof safes to how to interact with law enforcement, the value of getting to know law enforcement through HTCIA, and valuable products for host forensics (AccessData with their Forensic Toolkit and Guidance Software with EnCase). Another example was a recommendation to search for “cybercrime DOJ forensics” to find an excellent DOJ paper for first responders.
The second half was non-narrative and more technical, built around a series of network traffic traces. Oddly, it seemed very impressionistic; watching her muse in Wireshark over which capture file to pull up made me wonder how much advance thought and planning had gone into this presentation. But I digress.
Chappell showed a site listing credit card numbers for sale, with related info (security codes, etc.). Next she showed Wireshark in action, using a sample trace file available from the Wireshark book site. She also discussed Nmap (a network scanning and discovery product); Zenmap, a graphical add-on, lets you traceroute to the found locations and see the relationships of devices on the internet. These are the kinds of resources used for reconnaissance. There was much practical advice on social engineering, on traffic analysis techniques in general, and on Wireshark in particular. For example, even handshake refusals are significant, since they frequently identify chatty firewalls. Chappell gave a good description of using taps to capture traffic for analysis, the logistics involved, what she looks for, and the characteristics of troublesome packets.
The second half of the presentation seemed driven by the samples chosen on-the-fly, rather than having samples selected to illustrate planned, organized ideas. That’s not to say there was no insight or value. But it did seem wandering and unstructured. Re-listening and trying to outline her presentation, I was often left scratching my head. That being said, Chappell consistently gets high ratings for her Tech-Ed presentations; I’ll have to listen again to see what I’m missing.
Kai Axford’s security talks are consistently well-attended, and he has been presenting on security for a long time. This year he teamed with Allyn Lynd, an FBI special agent. Their talk was long on stories and, truth be told, short on insight relevant to the corporate IT security world, as we will see.
Let’s look at the case studies briefly.
Case Study #1: Phone Phreakers. Lynd did a good job of explaining the defective mentality of people who pull these pranks, and the trivial, trite uses to which their occasional technical savvy is put. The pranks and scams were ingenious, but pointless. The most egregious was an example of “SWATting”, an attempt to trick emergency responders into dispatching a team when no emergency exists, usually by spoofing a phone number. Obviously this can lead to confusion, misunderstanding, loss, and injury. Interesting, but I’m not sure why this is central to a presentation for IT professionals.
Case Study #2: Trusted insider case, someone working the night shift at a hospital. Using his access card, he interfered with hospital HVAC operations, and publicly bragged about it, going so far as to post a YouTube video about it (he nonetheless denied doing it when arrested). Entertaining, but there are also a few lessons here: carefully screen even low-level staff, and audit, because there’s no such thing as trivial physical access to servers.
Lynd tangentially made the point that social networking sites are major sources of risk for penetration and intelligence-gathering, and that FBI agents do not create profiles on them. Identity theft and theft of trade secrets were also covered, albeit quite briefly and with little depth. Then came nearly twenty minutes of Q&A.
So afterwards, I was impressed by the intricacy and depth of malice out there, and obviously as a society, we have to wonder what’s going on with our culture that we’re producing sociopaths like the ones featured here. That being said, the relevance of the case study material for organizational IT was not all that strong, and at the end, it is difficult to see why some of this material was chosen. Clearer focus on the likely needs of the audience would have made this a more enlightening presentation.
Not so long ago I reviewed Mark Minasi’s Tech-Ed 2008 IPV6 presentation. Frankly, it’s hard to imagine anything better, in terms of style and suppleness. Nonetheless, he does not deliver it every year, and others have come forward more recently to explain this technology. This year, Swiss maven Marc Michault presented a survey of IPV6 which covered much of the same ground, but with added depth on some aspects, making this a worthy session to summarize and review.
Good overview of the packet construction. The header has been redesigned, and now weighs in at a fixed 40 bytes, mostly source and destination addresses; this simplifies the work of routing. Addresses are written in hex, rather than the dotted-decimal notation of IPV4. They are also rich in leading zeroes, all of which can be omitted. Thus,
FD00:0000:0000:0021:0001:0000:0000:5143 becomes
FD00:0:0:21:1:0:0:5143, which can be further condensed to
FD00::21:1:0:0:5143. Note that this distillation of :0:0: to :: can only be done once.
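If you want to check a compression by machine, Python’s standard ipaddress module applies these same rules; a quick sketch (my own check, not Michault’s):

```python
import ipaddress

addr = ipaddress.IPv6Address("FD00:0000:0000:0021:0001:0000:0000:5143")
print(addr.compressed)  # fd00::21:1:0:0:5143  (leading zeroes dropped, one :: only)
print(addr.exploded)    # fd00:0000:0000:0021:0001:0000:0000:5143
```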
Subnet masks are also quite different: the subnet ID comes in at 16 bits (the network ID is 48 bits, and the interface ID, the local name, is 64 bits). This allows some 65,000 subnets (2^16 = 65,536). For most purposes this is enough.
Addresses are also different in terms of their number, in that a given node, or NIC, can have several IPV6 addresses.
Address Type #1. The Link-Local address, which always starts with FE80, is self-generated, just like current APIPA addresses. This is used, among other things, for router discovery. Because a link-local address is only guaranteed unique on its own link, the same address can legitimately turn up on different links; to disambiguate, a zone ID can be appended after a percent sign (%), naming the link/subnet on which the address resides.
Address Type #2. Global Unicast Addresses are routable and can be used for global communications. The first three bits identify the type of address, and are currently set so that these addresses begin with “2” or “3”; the next 45 bits provide the network ID and are designed for global routing.
Address Type #3. Unique Local IPV6 Unicast Addresses start with FD. These are not used for global routing; they are for intra-organizational routing, replacing the old site-local addresses.
Address Type #4. Multicast addresses replace broadcasting, which cannot be done in IPV6. These addresses start with FF. Here are some examples of the most common multicast addresses:
FF01::1 Interface-Local All Nodes
FF02::1 Link-Local All Nodes
FF01::2 Interface-Local All Routers
FF02::2 Link-Local All Routers
FF05::2 Site Local All Routers
Solicited Node, Link-Layer Multicast Address. This part was less successful, and after several re-listenings it remains unclear to me.
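As an aside (my own illustration, not Michault’s), the same ipaddress module can classify the address types just described, which makes for a quick sanity check; the global unicast sample below is an arbitrary public DNS address standing in for any routable address:

```python
import ipaddress

# Link-local (FE80), unique local (FD), global unicast (2/3), multicast (FF).
samples = ["fe80::1", "fd00::21:1:0:0:5143", "2001:4860:4860::8888", "ff02::1"]
for s in samples:
    a = ipaddress.IPv6Address(s)
    print(f"{s:25} link-local={a.is_link_local} multicast={a.is_multicast} global={a.is_global}")
```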
II. Auto-Configuration
Two types: stateless and stateful. The former is autopilot; the latter requires some upstream preparation and manual setting. Stateless involves link-local neighbor discovery, and router advertisement. Stateful continues to rely on DHCP for settings which cannot be provided by routers (time servers, DNS servers, etc.).
He makes points similar to those in Minasi’s presentation, including the problems of using MAC addresses in IPV6 addresses. Instead, machines self-generate addresses listed as “tentative” until duplicate address detection (DAD) has confirmed no duplicate exists. Only then is the address confirmed as “valid”.
Once the machine has identified itself and the other nodes on its segment, it next seeks to see beyond that segment. It looks for routers, with discovery messages sent in multicast; of course only routers respond. The router response gives the machine a prefix for that router’s network, and the machine appends this to its own segment, thus creating a globally routable address.
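A rough sketch of that last step, using hypothetical values (a /64 prefix a router might advertise, and a self-generated 64-bit interface ID), and assuming the usual 64/64 split:

```python
import ipaddress

# Hypothetical /64 prefix learned from a router advertisement, plus a
# self-generated 64-bit interface ID (EUI-64-style here).
prefix = ipaddress.IPv6Network("2001:db8:1234:5678::/64")
interface_id = 0x021122FFFE334455

# The globally routable address is simply prefix + interface ID.
global_addr = ipaddress.IPv6Address(int(prefix.network_address) | interface_id)
print(global_addr)  # 2001:db8:1234:5678:211:22ff:fe33:4455
```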
That’s the easy, stateless style of auto-configuration. Needless to say, the “stateful” style takes more effort. Addresses can be assigned manually, though this is seldom desirable. DHCP is also an option, allowing many, many settings to be provided, such as domain suffix, time servers, etc. For most corporate networks, this will remain in use for the foreseeable future. It’s only natural that some day soon, routers will be marketed that provide these DHCP functions.
Michault does not mention this, but it is worth noting that at this point the machine may have several addresses: the initial self-generated one used within its segment, and the one(s) derived from prefixes provided by its router.
III. Name Resolution
Based on the nature of name acquisition, one would think that this should be straightforward. One would be wrong. Michault found three points to make:
1. Local subnet, with Link-Local Multicast Name Resolution (LLMNR). As of mid-2010, the details for DNS-related multicasts are still being worked out. In this case, hosts can contact the multicast address and the message will be forwarded to each machine on the segment, which will in turn respond. This also replaces the browser service.
2. Internet, with Peer Name Resolution Protocol (PNRP). LLMNR is fine for identifying and contacting neighbors, but more is needed for communication at larger scales. This is also used in non-IPV6 networks. PNRP hashes the host’s machine name and defines a proximity based on the hash. Each node (computer) maintains hashes of neighboring machine names; these are reviewed, and successively closer nodes are contacted, eventually (it is hoped) yielding a match.
3. Internet, with DNSV6. There are few major differences. Since addresses are much longer, the A record is replaced by an AAAA record, and since reverse notation is written nibble by nibble in hex, the PTR names can become quite long. Other than this, there are few changes.
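To see just how long those reverse names get, the ipaddress module can generate the nibble-reversed ip6.arpa form; the address here is simply the documentation prefix, used for illustration:

```python
import ipaddress

addr = ipaddress.IPv6Address("2001:db8::1")  # would live behind an AAAA record
print(addr.reverse_pointer)
# 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2.ip6.arpa
```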
Michault summarized by discussing some current technologies of note which run on IPV6. DirectAccess was the foremost example here.
Michault finally reviewed some important transition technologies for making this work.
Tunneling. Like other types of tunneling, this approach takes IPV6 packets and wraps them in IPV4 headers. Normally tunneling is done for encryption; in this case, however, there is no added security.
ISATAP, or Intra-Site Automatic Tunnel Addressing Protocol. This provides support for IPV6 applications running on an IPV4 network. It does this by adding, in effect, an IPV6 prefix to an IPV4 address. For private networks, this would be ::0:5EFE; for a public network, ::200:5EFE. Michault gave a good explanation of how this works for transporting IPV6 traffic through an IPV4 network, and also for how it would work in a mixed network of IPV4 and IPV6 segments.
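A sketch of that embedding, with made-up prefix and IPV4 values (0:5EFE being the private-address form Michault mentions); the helper function name is mine, not anything from the session:

```python
import ipaddress

def isatap_address(prefix: str, ipv4: str, public: bool = False) -> ipaddress.IPv6Address:
    """Append the ISATAP interface ID (0:5EFE or 200:5EFE plus the embedded
    IPv4 address) to a /64 prefix."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    iid = ((0x0200 if public else 0x0000) << 48) | (0x5EFE << 32) | v4
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | iid)

print(isatap_address("fe80::/64", "192.168.1.10"))  # fe80::5efe:c0a8:10a
```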
6to4. This is intended for IPV6 organizations that must continue to interact with IPV4 ones. It works by viewing the IPV4 network as the 2002:WWXX:YYZZ:: range: a gateway turns the IPV4 address into hex and places it under the 2002:: prefix, and this derived prefix is then presented to the routers in the organization.
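The 6to4 derivation is mechanical enough to show in a few lines; the IPV4 address below is the documentation address 192.0.2.1, chosen purely for illustration:

```python
import ipaddress

def to_6to4_prefix(public_ipv4: str) -> ipaddress.IPv6Network:
    """Derive the 2002:WWXX:YYZZ::/48 prefix from a public IPv4 address."""
    v4 = int(ipaddress.IPv4Address(public_ipv4))
    # Place the 32-bit IPv4 address directly after the fixed 2002::/16 prefix.
    return ipaddress.IPv6Network(((0x2002 << 112) | (v4 << 80), 48))

print(to_6to4_prefix("192.0.2.1"))  # 2002:c000:201::/48
```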
Teredo. “a pain and a bit of a bitch”, to quote Michault. This allows IPV6 hosts to communicate through IPV4 NATs. This transitional technology should soon fade. In this system, hosts get Teredo addresses which include the assigned Teredo address range (2001::/32), the public IPV4 address of their Teredo server, and the obscured public IPV4 address and port for Teredo traffic on their NAT. Communication runs through a complicated process, including NAT1 to NAT2, then NAT2 to Teredo Server 2, and then Teredo Server 2 to Target Host2, followed by Target Host2 to NAT2 to NAT1 to Target Host1. Michault referred the curious to whitepapers and Joseph Davies’ second edition of Understanding IPV6. Perhaps the most reassuring lesson here is that it’s a transitional technology, not intended for permanent presence.
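For the curious, the address layout itself is straightforward to reproduce; this is a sketch of the format described above, and the server, client, and port values are illustrative numbers, not anything from the session:

```python
import ipaddress

def teredo_address(server_v4: str, client_v4: str, client_port: int,
                   flags: int = 0x8000) -> ipaddress.IPv6Address:
    """Assemble a Teredo address: the 2001::/32 prefix, the Teredo server's
    IPv4 address, flags, then the client's public UDP port and IPv4 address,
    each obscured by XOR-ing with all ones."""
    server = int(ipaddress.IPv4Address(server_v4))
    obscured_client = int(ipaddress.IPv4Address(client_v4)) ^ 0xFFFFFFFF
    obscured_port = client_port ^ 0xFFFF
    value = ((0x20010000 << 96) | (server << 64) | (flags << 48)
             | (obscured_port << 32) | obscured_client)
    return ipaddress.IPv6Address(value)

print(teredo_address("65.54.227.120", "192.0.2.45", 40000))
# 2001:0:4136:e378:8000:63bf:3fff:fdd2
```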
Conclusion: Few are willing to quickly and fully embrace IPV6, whatever its benefits. Yet its strengths are compelling, and most importantly, the world will soon run out of IPV4 addresses. This will be the IP backbone for the rest of our lives, and it behooves us to understand it. Michault’s presentation was valiant, especially for a second-language presenter, and the speaker was articulate, intelligent, and focused. Still, the pacing seemed weak to me, the arcane was at times presented with too little explanation (and sometimes, frankly, seemed unnecessary), and the scope of the talk was perhaps too broad. In retrospect, I’d have recommended a shorter list of topics, covered in more depth.
Post-Script: In an age where we have enough IPV6 addresses for every hair on your head to be uniquely identified, it’s inevitable that some day our coffee-makers, credit cards, and pets will have their own unique addresses. Additionally, now that addresses have six letters as well as ten digits, it’s inevitable that people will look for lucky numbers or ways to include messages in their addresses. Rude suggestions to the FBI, DOD, CIA, for example, could be built into addresses. More reflection on these likely futures would be worthwhile.