
 Lights-Out Administration
 by Rusty Lingenfelter, in Editorials - Sat, Dec 22nd 2001 00:00 UTC

Current network and systems administration tools offer engineers a wide range of capabilities for remote administration. What remains limited is the ability to remotely power cycle a server or network device and to run remote diagnostics on a machine that will not boot. This paper outlines the requirements for a set of industry-standard devices that perform these remote functions on servers and network devices, addressing a situation that network and systems administrators commonly face.


Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.

As more Information Technology departments centralize and consolidate to reduce cost, many remote sites are left with no on-site IT support. Remote administration of computers is increasingly common because of the significant cost benefits; many tasks can be automated, and the administrator does not have to physically visit each computer (CERT, 2000). In their whitepaper on remote systems administration, Stephen Packard and Archie Andrews state that remote systems administration is a reasonable, economical approach (Packard and Andrews, 2000). Also, as networks and servers become critical to nearly all business functions, more IT departments are staffing or providing for some type of round-the-clock monitoring and support. While 24x7 support is a great capability, it is limited by the ability to gain physical access to off-site network devices and servers when they lock up or cannot be reached in-band. Even when the problem occurs during business hours, the lack of on-site IT staff may require an unskilled user to work with the remote IT staff to correct the problem. This is not a good use of the user's time and may turn a small problem into a large one. The type of device discussed in this paper would give network and systems administrators access to functions and information that would normally require physical access to the equipment.

The state of the art in remote systems and network device administration includes graphical tools that allow the simultaneous remote monitoring, configuration, and management of a large number of devices. Uninterruptible power supplies can be connected to the network (typically in-band), which allows significant monitoring and a cold boot (i.e., power off/power on). At least one PC server vendor offers a board that allows remote power cycling and limited administration on a machine that will not boot. With these capabilities, why is another class of remote management tool required?

Toolsets for remote configuration and diagnostics tend to be vendor-specific and work with only a subset of a vendor's products. Unless your operating environment contains a limited set of equipment from a single vendor, remote administration will likely require multiple software products. Further, each of these tools tends to be expensive and require training and experience to use properly. The problem with tools that rely on in-band communication is that a frozen device often will not be reachable in-band. Also, these in-band solutions create network overhead that competes with customers for available bandwidth. While smart UPS devices are useful, a simple cold-boot capability with no diagnostic capability is of limited value. Compaq offers the Remote Insight Board (and a Lights-Out Edition) that provides some of the capabilities that I will describe below. This is a great tool which provides significant capability, but it only works with a limited set of Compaq servers. My proposal is for a more limited feature set with greater out-of-band communications capability and the ability to work with any PCI- or PCMCIA-compliant device.

This paper recommends the creation of an open standard for a remote control and diagnostic device that includes an open standard MIB (Management Information Base). A standard MIB would allow the device to report status to the large variety of SNMP (Simple Network Management Protocol) applications currently available. This standard should also define an API that would allow applications programmers to access all of the features and capabilities of the device. The standard and MIB could be developed through the Internet Engineering Task Force's standards process and published as a Request for Comments (Bradner, 1996). The creation of these standards would give hardware and software vendors equal access to the market. It could significantly reduce the cost of network and systems management by eliminating the multiple applications now required for administration.
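To make the idea of a standard MIB more concrete, here is a minimal sketch in Python of how such a device might expose a few status values under one OID subtree. It is not a real SNMP agent and uses no SNMP library. The arc 1.3.6.1.4.1 is the real "enterprises" subtree, but the enterprise number 99999, the object names (loPowerState and so on), and the sample values are all assumptions made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

Oid = Tuple[int, ...]

# 1.3.6.1.4.1 is the standard "enterprises" arc. The enterprise number 99999
# and every object below it are hypothetical placeholders.
BASE: Oid = (1, 3, 6, 1, 4, 1, 99999, 1)

# OID -> (object name, current value). Names and values are invented.
MIB: Dict[Oid, Tuple[str, object]] = {
    BASE + (1, 0): ("loPowerState", "on"),
    BASE + (2, 0): ("loLastPostCode", 0),
    BASE + (3, 0): ("loBatteryMinutesLeft", 30),
    BASE + (4, 0): ("loActiveLink", "ethernet"),
}


def parse_oid(text: str) -> Oid:
    """Turn dotted notation such as '1.3.6.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


def snmp_get(oid: Oid) -> Optional[object]:
    """Return the value stored at exactly this OID, or None if absent."""
    entry = MIB.get(oid)
    return entry[1] if entry else None


def snmp_get_next(oid: Oid) -> Optional[Oid]:
    """Return the first OID that sorts after the given one, if any."""
    later = sorted(o for o in MIB if o > oid)
    return later[0] if later else None


def walk(root: Oid) -> List[Tuple[str, object]]:
    """Collect every (name, value) pair under root, in OID order."""
    results = []
    current = snmp_get_next(root)
    while current is not None and current[:len(root)] == root:
        results.append(MIB[current])
        current = snmp_get_next(current)
    return results
```

A management application from any vendor could then read the same objects, for example with snmp_get(parse_oid("1.3.6.1.4.1.99999.1.3.0")) for the remaining battery time, which is the kind of interoperability a published MIB is meant to provide.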

The potential features of the device are nearly unlimited, but the minimum features required include:

  • Support for a variety of in- and out-of-band communications which would include (but not be limited to):
    • 100BASE-T (RJ45),
    • Plain Old Telephone Service (RJ11),
    • Cellular Digital Packet Data,
    • and 802.11b.
  • The ability to be managed by SNMP through a standard MIB.
  • The ability to access device features and data via a Web browser or directly via an open API with a standard data specification and access method.
  • Support for full DHCP or static entries.
  • Support for SSL.
  • An EEPROM socket for a user-customized ROM.
  • User- and roles-based security with appropriate logging.
  • System event logging.
  • Support for warm and cold device rebooting.
  • Support for POST (Power-On Self-Test) reporting via SNMP (MIB) and as events written to the system log.
  • On-board battery backup for sustained functions for at least 30 minutes.

These minimum features would address the most common problem for systems and network administrators: rebooting a frozen server or device. They would also allow diagnosis, to facilitate a decision about whether or not to dispatch repair service.
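As a rough illustration of how the reboot, roles-based security, and event-logging requirements could fit together, the following Python sketch models a card that checks a user's role before a warm or cold reboot and records every attempt. The class name LightsOutCard, the role names, and the permission table are hypothetical; a real standard would define its own.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical role table: which actions each role may perform.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "operator": {"read_log", "warm_reboot"},
    "admin": {"read_log", "warm_reboot", "cold_reboot"},
}


@dataclass
class Event:
    timestamp: float
    user: str
    action: str
    allowed: bool


@dataclass
class LightsOutCard:
    users: Dict[str, str]                      # user name -> role name
    log: List[Event] = field(default_factory=list)
    power_cycles: int = 0                      # count of cold reboots performed

    def _authorize(self, user: str, action: str) -> bool:
        """Check the user's role and log the attempt, allowed or not."""
        role = self.users.get(user)
        allowed = action in ROLE_PERMISSIONS.get(role, set())
        self.log.append(Event(time.time(), user, action, allowed))
        return allowed

    def reboot(self, user: str, kind: str) -> bool:
        """Request a 'warm' or 'cold' reboot; return True if it was permitted."""
        if kind not in ("warm", "cold"):
            raise ValueError("kind must be 'warm' or 'cold'")
        if not self._authorize(user, f"{kind}_reboot"):
            return False
        if kind == "cold":
            # A real card would toggle the power line here; we only count it.
            self.power_cycles += 1
        return True
```

With users={"alice": "admin", "bob": "operator"}, a call to reboot("bob", "cold") is refused and logged, while reboot("alice", "cold") succeeds. Logging denied attempts as well as successful ones is a design choice here, made so that the event log doubles as an audit trail.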

The standard should also provide for full capture and reporting of data from SMART (Self-Monitoring, Analysis, and Reporting Technology) hard drives and from devices and systems that use ACPI (Advanced Configuration and Power Interface) and other related standards. It should provide sufficient capacity for future growth and interface with other new devices as those standards are accepted.
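One way the device might turn captured SMART data into something reportable is sketched below in Python. It assumes the commonly used convention that an attribute is failing when its normalized value drops to or below the vendor threshold, and that a threshold of 0 means the attribute is informational only. The sample readings are invented, and the function name failing_attributes is hypothetical.

```python
from typing import List, NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int       # normalized current value reported by the drive
    threshold: int   # vendor-set failure threshold (0 = informational only)


def failing_attributes(attrs: List[SmartAttribute]) -> List[str]:
    """Return the names of attributes whose value is at or below threshold."""
    return [
        a.name
        for a in attrs
        if a.threshold > 0 and a.value <= a.threshold
    ]


# Invented sample readings, for illustration only.
SAMPLE = [
    SmartAttribute(5, "Reallocated_Sector_Ct", 30, 36),
    SmartAttribute(9, "Power_On_Hours", 95, 0),
    SmartAttribute(10, "Spin_Retry_Count", 100, 97),
]
```

The resulting list could be written to the system event log or exposed through the MIB, so that a failing drive is visible to the administrator before a dispatch decision is made.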

One thing that sets this device apart from those currently available is the variety of connection methods. The ability to reach the device over a wireless connection is particularly useful. Because most large organizations use some type of PBX (Private Branch Exchange), spare POTS lines are scarce, and server and communication rooms are already crowded with cables. CDPD connectivity would provide a low-cost method for ready access to the device when in-band signaling is not available.
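The fallback between in-band and out-of-band connections can be sketched as a simple priority list. In the Python example below, the link names come from the feature list, but the order of preference (in-band Ethernet first, then wireless, with dial-up last) and the function name choose_link are assumptions; an actual standard might leave the order configurable.

```python
from typing import Dict, List, Optional

# Assumed preference order: in-band first, then wireless, then dial-up.
LINK_PRIORITY: List[str] = ["ethernet", "802.11b", "cdpd", "pots"]


def choose_link(
    is_up: Dict[str, bool],
    priority: Optional[List[str]] = None,
) -> Optional[str]:
    """Return the most preferred link that is currently up, or None."""
    for name in (priority if priority is not None else LINK_PRIORITY):
        if is_up.get(name, False):
            return name
    return None
```

For example, choose_link({"ethernet": False, "cdpd": True, "pots": True}) selects "cdpd", which matches the scenario described above: the in-band path is down and the card falls back to a wireless out-of-band path before resorting to a scarce analog line.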

The device should be implemented in two versions. Each device would have common features but would vary by physical interface and form factor. The first device would use the industry standard PCI interface on a half-length card, to facilitate compatibility with the widest range of products. The second device would use the industry standard PCMCIA interface and Type II form factor. These two devices could provide interfaces to networking devices, servers, and appliance devices.

In conclusion, this paper presented a concept for a standard specification for a device to facilitate remote management of servers and network devices. While the communications capabilities of the device described would be significant, its most important feature would be that it was built around an open standard available to all manufacturers and applications developers. This would lead to significant cost savings for IT managers by reducing software, software maintenance, and staff training costs. It would facilitate efficient management by significantly reducing the number of applications required to effectively manage a network and networked systems. It would allow greater integration with intelligent network management systems for automated response to outages. While this paper described add-on devices, the technology could be integrated into servers and network devices as a value-added feature. I anticipate that these devices would add no more than $500 to the cost of a system. This cost pales in comparison to the hard and soft costs of remote administration described throughout this paper.

Bibliography

Bradner, S. (1996)
RFC 2026
CERT (2000)
Configure computers for secure remote administration
Packard, Stephen L. and Andrews, Archie D. (2000)
Remote System Administration


Author's bio:

Rusty Lingenfelter is currently a student at East Carolina University, enrolled in its Master's program in Industrial Technology Digital Communications. He is employed by the Iowa Army National Guard as a Battalion Commander and Training Officer. He was previously employed at the National Guard Bureau, where he served as the Chief of Communication Operations and Chief of Systems Engineering. He has worked in a variety of engineering and operations positions for the past 9 years. He lives in rural Iowa with his wife and three children.



 Referenced categories

Topic :: System :: Hardware
Topic :: System :: Monitoring
Topic :: System :: Networking :: Monitoring
Topic :: System :: Power (UPS)
Topic :: System :: Systems Administration

 Comments

[»] Author's Response 1
by Ling - Jan 3rd 2002 12:49:15

Some great input on some great devices. I am not going to critique each device or product, but suffice it to say that none of them addresses all of the requirements outlined. My focus is not on a specific device, but on an open standard, a published MIB, and out-of-band capabilities. So far the discussion has focused on specific devices or software that really exemplify the problem (i.e., vendor-specific and in-band) that an open standard could address. The only one that really heads down the path that I outline is the RealWeasel board. While it does not immediately address many of the requirements that I outlined, the fact that it is based on an open specification certainly sets the stage for accessing it from within any vendor's network management solution (e.g., OpenView, Unicenter), which is a key tenet of my paper.

While sharing solutions is valuable to the readers, I would also be interested in your thoughts on the requirements for the perfect solution for remote server and network management. Make the assumption that it does not exist, as I am pretty confident that it does not. If you were king (or queen) for a day and could write the open standard, what capabilities would the specification include? My brain is probably limited to current technologies, and I had difficulty getting out of that box. I shared this with Freshmeat because I wanted feedback from system administrators and network managers, and that is obviously what is happening. Thanks for your interest.



[»] Seasons Greetings - CIM is free.
by be - Dec 22nd 2001 22:48:31

It worries me that anyone would suggest increasing the cost of a server by ~$500 for a tool that should never be required if the server is reliable and the sysadmin does his/her job properly. Personally I'd rather see my wages go up by $500 per server! :)

Compaq Insight Manager has been around for years, is free, and supports Linux as well as other OSes. (Other server builders would do well to develop agents for CIM instead of TNG.)

Nevertheless, I'm still trying to think of a situation (other than a floppy or bootable CD left in the drive) where a server fails to boot and sysadmin attendance would not be required. Most server builders use the security of their systems as a selling point; a tool that provides an avenue for remote unattended access to the hardware while the OS is inoperable sounds dangerous. The other thing that would worry me is whether the server would continue to function if the device itself failed. Of the 150 servers I have managed over the past 2 years, none less than four years old have failed because of hardware problems.

A reliable supplier of spare parts sounds like a safer bet.



[»] Remote control
by freshmeat@thewrittenword.com - Dec 22nd 2001 19:59:11

http://www.realweasel.com
http://www.apcc.com/products/masterswitch/index.cfm

Great combination!



[»] hrmmm... :)
by gafami - Dec 22nd 2001 12:32:40

i am not 100% sure if this device is what has been meant in the article.. but we use one of those:

http://www.avocent.de/enterp_loesung.asp


pretty neat solution for remote and local management of multiple servers...

--
startupx.com - the future of start-up financing



[»] It is there already
by Peer Oliver Schmidt - Dec 22nd 2001 04:03:08

A device to remotely power cycle a machine via an IP network already exists. The German distributor Allnet (http://www.allnet.de) sells such a thing.

It lets you remotely power up to 8 devices on or off.



    [»] Hum HP TopTools
    by Francois Harvey - Dec 23rd 2001 08:11:23

    A card with a lot of options (BIOS access, DMI browser, SNMP, server status, etc.) already exists: the TopTools card. Some of my customers use it for critical servers.

    www.hp.com/toptools/

    (A free version of the TopTools software exists, but the card costs some money.)




