Adding RCS to the System Administrative Toolbox

by Nick Christenson
npc@jetcafe.org
July 13, 1998

Introduction

Every System Administrator has run across data in system administrative files which they have no recollection of adding, or any idea why they are there. For example, it can sometimes be hard to know who added a particular account to a particular system, and then to determine when it was added, and why. Similarly, it's likely that one has added lines to a file, for example the sendmail.cf file, and then at a later date it is completely forgotten why they are necessary, when they were added, and what, if anything, they might have replaced.

Furthermore, just about everyone has made changes to a configuration file, then decided they wanted to undo them, but couldn't remember exactly how the file looked before. On top of this, most every SA has, on occasion, forgotten to make a backup copy of important files before making significant changes. Perhaps they didn't mean to make drastic changes to begin with, or maybe the file was saved or altered by a keystroke made in error, but either way it would be nice to go back to a version of the file that now exists only as a memory.

While all these problems could be resolved by any one of a number of methods, our friends the programmers have for years had access to a number of tools that are designed to track changes to their source code files. These tools are called version control systems, and they should be as familiar to SAs as they are to programmers. In fact, the use of version control systems extends beyond both of these applications to other kinds of files. Papers, correspondence, data, configuration files, etc. are all documents to which revision control systems can be applied with great effectiveness. It is my opinion that absolutely every text document that undergoes modification and is organized line by line ought to be under some sort of revision control system. This article will explain the fundamentals of one of these systems, RCS, as it can be applied to System Administration tasks.

What is version control?

A version control system is a suite of software designed to help manage changes to a given text-oriented document or set of documents. Most of these systems were designed with the management of software source code in mind, but version control systems are far more versatile than this would suggest. Although a large share of these systems' feature sets will typically go unused outside of source code management by a team of programmers, the basic requirements for all line-oriented text documents are the same. We need some way to manage changes, provide comments on revisions, compare versions, determine when changes were made, and extract arbitrary revisions. This is true whether the document is a piece of source code that may be modified by several programmers, the /etc/passwd file that may be modified by one or more system administrators, HTML documents on a web site that may be modified by one or more people, or other forms of documentation, like papers, letters, etc. In fact, this article and its outline are being managed under RCS as they go through various stages of development. Some of the extended capabilities, such as document merging and revision branches, may not be needed in some of these applications, but that doesn't interfere with the tool's applicability.

Which version control system should I use?

The short answer to this question is that one should use whichever version control system is most handy. If a specific system is already familiar from programming or some other task, it probably can easily be adapted to system administration uses, and given this, there's no good reason to learn another tool.

Alternatively, if one is not exceedingly familiar with any one version control system, but there are programmers or documenters for whom one provides SA support who do use a specific system, then that system would be a logical choice. First, one wants to use a system that will be installed everywhere. Any system that is already in widespread use is probably present on a number of systems, and it's likely that reasonably well debugged install packages for it exist. Why make more work? Second, if there is one package that is already in popular use at one's site, then it's more likely that if, through lack of experience, one does something to an important file that one doesn't know how to undo, sufficient expertise will be available to help fix the problem with minimal impact. The relative benefit of having in-house expertise on a given system is almost certainly sufficient to justify making it the front runner for adoption to other tasks. Third, by becoming familiar with a package that is in widespread use by one's customer base, the SA team can eventually assist in education and troubleshooting of customer problems with the same package. Certainly, the SAs will be asked plenty of questions about it, if they are not expected to provide support for it outright.

If multiple or no version control systems are in widespread use at one's site, then one will need to be selected by some other mechanism. However, it's very easy to overanalyze the decision of which version control system to adopt. In fact, the only really wrong decision is the decision not to adopt one. Just pick one and go with it. If it turns out that one wants to switch later, this is not terribly difficult as there are tools which can help with this.

In this article, I will deal exclusively with RCS as the version control system of choice. I have chosen to use RCS for several reasons. First, it is very widely used; second, it is widely available; third, the documentation for it is outstanding; fourth, changes in the file are recorded as differences against the most recent version, so that if for some reason every RCS binary in the world suddenly stopped working, it would be straightforward to extract the most recent version of the file by hand; but mostly, it is what I have gotten used to using.

What do I really need to know?

When a file is checked in to RCS, its format is changed, and a ",v" (that's a comma plus a "v") is appended to the name of the file. When a file is checked out of RCS, it reappears in its original state. If there is a directory called "RCS" in the same directory as the file, the ",v" file is placed within the RCS directory in order to avoid cluttering up the directory itself. The format of the RCS file is still all readable text, at least if the file that was checked in was, but its contents are only intended to be accessed through the programs that make up the RCS system.

The "RCS" directory is technically optional, but I strongly recommend its use in all cases. In the first place, it's helpful to not clutter up directories like /etc with additional files, but it also allows one to restrict general access to the comments and changes made to these files by controlling the access permissions of the directory itself. Running chmod 700 RCS and making these directories owned by root everywhere keeps this information under wraps.
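The directory setup just described can be sketched as a small shell function. This is only an illustration: the name make_rcs_dir is hypothetical, and the sketch does not change ownership, since that requires root privileges and depends on site policy.

```shell
# make_rcs_dir: hypothetical helper, not part of RCS.
# Creates an RCS subdirectory under the given directory (default: the
# current one) and restricts access to the directory's owner.
make_rcs_dir() {
    dir=${1:-.}
    mkdir -p "$dir/RCS" && chmod 700 "$dir/RCS"
}
```

Run as root, "make_rcs_dir /etc" would leave /etc/RCS owned by root and closed to everyone else.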

ci -- check in

The command "ci" checks a document in to version control. Running "ci file" will cause an RCS file to be created for the document in question if it does not exist, and the user is prompted for a description of the file. This is of more use in a programming environment where the purpose of the file might not be immediately clear by its name. In an SA context, it's probably not necessary to enter a description of the sendmail.cf file, for example. Once the description has been entered, type a period or EOF character alone on a line to be returned to the command prompt.

If one just types "ci file", the document in question will be removed when the RCS document, "file,v", is created or updated. In a programming environment, this is fine, but for a system file, this can be disastrous. Having the passwd file suddenly disappear is not desirable. It is imperative that this file exist in /etc and be readable at all times. Performing a "ci -u file" or "ci -l file" will retain a copy of the file in the current directory. In the former case, the copy will be unwriteable and no new RCS lock, which is a field in the RCS file that records the username of the person who intends to modify the file, will be present. In the latter case, the file will be writeable (with whatever permissions the file had; this information is saved within the RCS file) and an RCS lock will be recorded.

When performing a "ci" on a file that is already in version control, the user will be asked to enter a log message. This should be a brief description of what the nature of the change was, and who performed it. The date and time of the change will automatically be recorded. For example, if I add a new account for Fred to the passwd file, I might enter the log message as:

Created account for user fred.
npc
.

As with the file description, a "." or "^D" alone on a line will terminate the log message.

co -- check out

As one might expect, "co" stands for "check-out" and is used for extracting versions from RCS control. Issuing a "co file" will extract the most recent revision of file read only. The "co -l file" command will lock the file and will place a writeable copy in the working directory. The result of either a "ci file" or "ci -u file" followed by a "co -l file" is equivalent to issuing a "ci -l file". Also, issuing "ci file; co -u file" will produce the same result as running "ci -u file".

If one is using RCS properly, one will only edit files that one has locked oneself, with no exceptions. The reason for this is to prevent two people or processes from editing the same file at the same time, which may very well result in one clobbering the other's changes. Since only one entity can obtain a lock at any one time, the user knows whether it is safe to make changes or not. If one checks the file out and locks it, it is safe to edit. If a lock cannot be obtained, then someone else may be modifying that file.

This can be especially useful if automated processes could be editing the same files as human beings. For example, assume there exists a process that automatically updates the /etc/passwd file on a machine with new accounts driven by some external process. If I want to make a manual correction to that file, I need to make sure that this process does not try to update the file at the same time that I do. If we're both using RCS, both I and the process can try to obtain a lock on the file with a "co -l passwd". If it succeeds, whoever obtained the lock can safely modify the file and then run "ci -u passwd" to check in the changes and unlock the file for others to access. If either of us gets an error message saying "writeable passwd, remove it [ny](n): " or "is already locked by npc." then we know that someone else is accessing the file, and we should wait some period of time and try again.

One problem that can occur with this scheme is that someone may forget to unlock a file once they are done editing it. While locking and unlocking files properly is definitely the better procedural way to do things, sometimes, in order to get things done, one may have to be a bit more pragmatic. In these circumstances if only one user is editing the file, and in the SA case this tends to be root, an SA team might make the decision that it is better to always leave the file checked out and locked, and, hence, writeable, after changes have been made. This way, if someone comes along and makes other changes, without going through the RCS procedure, their modifications will still get made. This can be done by merely using "ci -l file" before every edit to make sure one's predecessor didn't forget this step, and after each edit to record one's changes. However, a team doing this runs the chance that multiple entities may edit the same file at the same time since one can't use the locking system to guarantee that one has exclusive access to the file.

There are a number of additional considerations that will be discussed in more detail later if one wants to adopt either of these schemes. Deciding whether to adopt a loose or strict locking policy is probably the hardest decision one has to make when adopting RCS. Fortunately, both systems are reasonable choices, and it's straightforward to switch from one style to another if desired.

rcs

The rcs command is a catch-all command. It primarily manipulates the RCS file without changing the actual contents of the file. For example, if someone accidentally leaves a file locked by themselves after they've left for the day, the command "rcs -u file" will remove their lock. Similarly, an "rcs -l file" will lock the most recent revision of the file. The "rcs" command can perform a number of other manipulations on the file as well. As with all the other commands, there are many more, less commonly used options available. Consult the man page for more information.
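Before breaking anyone's lock, it helps to know which files are locked at all. The following is a minimal sketch, assuming the ",v" files live in an RCS subdirectory as recommended earlier; the function name locked_files is hypothetical. It relies on the rlog flags "-L", which skips RCS files that have no locks set, and "-R", which prints only the name of each RCS file.

```shell
# locked_files: hypothetical helper, not part of RCS.
# Prints the names of the RCS files under DIR/RCS that currently have
# a lock set. DIR defaults to the current directory.
locked_files() {
    dir=${1:-.}
    # -L: ignore RCS files with no locks; -R: print only the RCS file name
    rlog -L -R "$dir"/RCS/*,v
}
```

For example, "locked_files /etc" would list the files someone forgot to unlock, after which "rcs -u" can be applied to each one as appropriate.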

rlog

The "rlog" command is used to look at what log information has been entered when each revision is checked in. In addition to whatever useful messages folks might have left, it also states the name of the user who checked in each revision. Since this is the audit trail that one will want to use to determine what changes were made when, it is crucial to leave meaningful messages. Something like, "made some changes, --npc", isn't going to do anyone any good. When entering a log message, stop and think about whether or not someone in the future will understand what was done based upon that message. If not, maybe another message would be more appropriate.

RCS supports the ability to inject identifiers into the document which may be helpful later. For example, I might put a " $Id$ ", which is the most comprehensive, and for SA applications, probably the most useful identifier available, into my sendmail.cf file. As the document is checked out, the RCS commands know about these identifiers and substitute the information for the identifier. For example, if I put the following line in the sendmail.cf:

# $Id$

and run "co -l sendmail.cf", it will be replaced with something like:

# $Id: rcs4sa.html,v 1.6 1998/05/25 21:12:59 npc Exp npc $

which contains the date and time (UTC) that it was modified, who modified the file, its name, etc.

One really nice thing about using these identifiers is that RCS has a pretty good grasp of how comments work for most all types of documents and handles them gracefully. In the case above, the "#" is the comment indicator for the sendmail.cf file. RCS helps make sure that whatever replaces the identifier is commented as well, even if the identifier is replaced with text that spans multiple lines. Note that one should not put identifiers in files that do not understand the notion of comments, such as /etc/passwd, /etc/hosts.equiv, etc., because these files treat every line as data. For example, if the above expansion of the " $Id$ " identifier were placed in one's hosts.equiv file, it may very well allow the user "$Id:" free access to one's system if they are coming in from the machine named "#". In any case, there are lots of identifiers available, and the rcsintro man page provides a good overview of them.
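Because an expanded " $Id$ " line has a predictable layout, a script can read fields back out of it. The following is only a sketch: the function name id_revision is hypothetical, and it assumes the identifier has already been expanded into the form shown in the example above (file name, revision, date, time, and so on, separated by single spaces).

```shell
# id_revision: hypothetical helper, not part of RCS.
# Prints the revision number found in an expanded $Id$ line of FILE.
# Prints nothing if the identifier is missing or still unexpanded.
id_revision() {
    sed -n 's/.*\$Id: [^ ]* \([0-9.]*\) .*/\1/p' "$1"
}
```

Given a file containing the expanded line shown earlier, "id_revision file" prints 1.6.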

rcsdiff

The rcsdiff command is used to identify the differences between revisions of an RCS document. The output should be quite familiar to those who have used the Unix diff command, which is used by RCS to handle the differences between the revisions. A typical usage might be as follows:

# rcsdiff -r1.2 -r1.1 inetd.conf
===================================================================
RCS file: RCS/inetd.conf,v
retrieving revision 1.2
retrieving revision 1.1
diff -r1.2 -r1.1
6,9c6,9
< #ftp  stream  tcp     nowait  root    /usr/libexec/ftpd       ftpd -l
< #telnet       stream  tcp     nowait  root    /usr/libexec/telnetd    telnetd
< #shell        stream  tcp     nowait  root    /usr/libexec/rshd       rshd
< #login        stream  tcp     nowait  root    /usr/libexec/rlogind    rlogind
---
> ftp   stream  tcp     nowait  root    /usr/libexec/ftpd       ftpd -l
> telnet        stream  tcp     nowait  root    /usr/libexec/telnetd    telnetd
> shell stream  tcp     nowait  root    /usr/libexec/rshd       rshd
> login stream  tcp     nowait  root    /usr/libexec/rlogind    rlogind

This example indicates that the four services listed were commented out between revisions 1.1 and 1.2. Again, a lot of flags and options are available; consult the man page for more information.

Pitfalls

Locking

RCS determines who the user is for purposes of record keeping from the real UID rather than the effective UID (EUID). The main reason for this is that in some programming environments, some of the RCS commands may be set UID (SUID) to a unique owner that represents the software project's configuration management. In this way, the CM folks may be able to perform manipulations on the source code that regular programmers cannot. It also can be used to prohibit programmers from modifying the source code without using the RCS tools, which can be handy if the programmers have a tendency to cut corners. In this way, even though the "ci" command may be SUID to rcsowner, when files are checked in the UID of the person executing the command, not rcsowner, will be logged as having checked in that revision.

This has some consequences for use on system files. If I, logged in as npc, execute "su" to become root, my UID will still be npc while my EUID becomes that of root. This means that if my site uses loose locking, that is, everyone just runs "ci -l" all the time, some other SA may get error messages that some file they need to modify is locked by npc after I've checked it in as root. Despite the fact that I've used su to become root, the lock is still made by my regular UID, which is npc. In this case, my colleagues would have to use "rcs -u" to break the lock before they could properly edit the file and use RCS to save revisions.

There are two good ways to get around this. The first is to set the locking policy on the file in question to non-strict. This is done with the "rcs -U file" command. From this point on, the owner of the lock isn't important in determining who gets to edit the file. That is, despite the fact that the previous lock was made by the UID of npc, root with some other UID can run "ci -l passwd" without any impediment or even error message. This locking policy change can be undone with "rcs -L". The only significant downside of this method is that after the first time any file is checked in to RCS, one has to remember to perform the "rcs -U" command on that file, or these sorts of lock conflicts may occur down the road.

The second way to handle this is to always use "su -" to become root. With the "-" flag, a full login is simulated. In this way, the UID is changed to be that of root as well as the EUID. The downside of this is, first, that everyone has to remember to type "su -" all the time, and second, that all the revisions will be attributed to root. That is, since my UID is that of root, nobody will know that npc checked in a revision of a file unless I put that information in the log, which I should do. In general, I think the loose locking method is a marginally better way to go, but I've seen them both work.
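The first approach, switching a file to non-strict locking right after its initial check-in so that the step is not forgotten, can be sketched as follows. The function name start_loose and the description text are hypothetical. The sketch assumes the file is not yet under RCS; the "-t-" flag supplies the description so that ci does not prompt for one.

```shell
# start_loose: hypothetical helper, not part of RCS.
# Performs the initial check-in of FILE, keeping a writeable, locked
# working copy, then sets the locking policy to non-strict.
start_loose() {
    file=$1
    ci -l -t-"System file under RCS." "$file" && rcs -q -U "$file"
}
```

Should the site later decide on strict locking, "rcs -L file" reverses the policy change.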

The Most Common Error

There is one error in using RCS on system files that occurs more often than any other, and if it occurs too often, can more than offset the gains that using RCS provides. That error is forgetting to make sure the file exists where it is supposed to be after a revision has been checked in to RCS. For example, all one has to do is type "ci passwd" when one means to type "ci -l passwd" and the passwd file disappears. This is why it must be mandatory to run "ls -l file" or something functionally equivalent after each use of the "ci" command. I like "ls -l" because it will also show me the permissions of the file. Whether the policy is to leave the file writeable or not, I can see that the permissions are proper by using this command.

It's been my experience that every SA at every site that uses RCS will do this at least once. If one gets in the habit of checking one's work, the damage will be minimized and the service quality will improve through the use of RCS.

There is one more thing that's worth noting before we go on to some examples of exactly how RCS can be used in the real world. There is exactly one significant difference between typing "ci file; co -l file" and "ci -l file", and that is that in the first case, there exists a window of time in which the target file does not exist in the current directory. That is, the ci removes the file until the co is run to restore it. This window may be very short, but if any other process wants to access the file in the meantime, there could be trouble. When one runs "ci -l file", the file always exists. In system administrative use, the ci command must always take either the -u or -l flag without exception so that system files do not disappear, even for a fraction of a second.
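Both rules in this section, never running ci without -l or -u and always looking at the file afterward, can be folded into a wrapper. The following is a minimal sketch rather than a hardened tool: the name safe_ci is hypothetical, and it assumes the working file is the last argument on the command line.

```shell
# safe_ci: hypothetical wrapper around ci, not part of RCS.
# Refuses to run unless -l or -u is among the arguments, then shows
# the working file with ls -l so its presence and permissions can be
# checked immediately.
safe_ci() {
    keep=no
    for arg in "$@"; do
        case "$arg" in
            -l*|-u*) keep=yes ;;
        esac
    done
    if [ "$keep" != yes ]; then
        echo "safe_ci: refusing to run ci without -l or -u" >&2
        return 2
    fi
    ci "$@" || return 1
    # Assumption: the working file is the last argument.
    for last in "$@"; do :; done
    ls -l "$last"
}
```

With this in place, "safe_ci passwd" fails loudly instead of removing the file, while "safe_ci -l passwd" checks in the revision and then lists the file.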

Examples

Let's assume for this first example that this site has decided to use strict locking, and that all attempts to use the su command to become root should take the form "su -". Let's also assume we want to place the /etc/passwd file under RCS for the first time and add a new user account. Also assume there is no shadow password scheme in use. Here's how it would go, with the user's input following each prompt:

% su -
Password:
# cd /etc
# mkdir RCS
# chmod 700 RCS
# ci -l passwd
RCS/passwd,v  <--  passwd 
enter description, terminated with single '.' or end of file:
NOTE: This is NOT the log message!
>> This is the passwd file.
>> npc
>> .
initial revision: 1.1
done
# ls -l passwd
-rw-r--r--  1 root  wheel  613 Jun 16  1998 passwd
# vipw
# ci -l passwd
RCS/passwd,v  <--  passwd
new revision: 1.2; previous revision: 1.1
enter log message, terminated with single '.' or end of file:
>> Added user fred.
>> npc
>> .
done
# ls -l passwd
-rw-r--r--  1 root  wheel  663 Jun 22  1998 passwd
#

Pretty simple, eh? That's as tough as it gets, too, since we had to make the directory and do this initial check in.

Let's try another example. Assume that we're using loose locking this time (we don't have to do the "su -" stuff) and we don't leave locked files lying around in any case, that is, we run "ci -u file" after making changes, and then run "ci -l file" only for the amount of time during which we want to edit the files. Let's assume that while editing the hosts.equiv file one notices the following line within that file:

trouble.badguy.org	bozo

Okay, maybe we don't know who badguy.org is, and we don't remember a user named bozo on our system. Probably, the first thing we do is ask around to see if anyone knows what this is for, but let's suppose that nobody within earshot does. At the very least, if folks are using RCS properly, we can make a determination as to about when the addition of this account occurred, and perhaps who it was who added it.

The first thing we do is look at the file:

# ls -l hosts.equiv
-r--r--r--  1 root  wheel  124 Jun 10 18:45 hosts.equiv

Since it is not writeable, the current version has probably not been modified since it was last checked in, but we should verify this by checking in the current version:

# ci -l hosts.equiv
RCS/hosts.equiv,v  <--  hosts.equiv 
file is unchanged; reverting to previous revision 1.5
done

Because there is no warning or error message, this indicates that yes, indeed, the current working version has been checked into RCS.

Next, it's time to figure out about when the entry in question was added. So, we use rcsdiff, noting that the current version is 1.5. We can run down this list by hand, but a simple shell script and the use of grep will make the job easier:

# for VERSION in 4 3 2 1
> do
>	echo "Version 1.$VERSION"
>	rcsdiff -r1.5 -r1.$VERSION hosts.equiv | grep bozo
> done
Version 1.4
Version 1.3
< trouble.badguy.org	bozo
Version 1.2
< trouble.badguy.org	bozo
Version 1.1
< trouble.badguy.org	bozo
#

From this we now know that there was no mention of this account in version 1.3, but that it was present in version 1.4. We know this because there is no difference involving the characters "bozo" between version 1.5 and version 1.4 of the file. Therefore, we want to see who checked in version 1.4 (right after the edit was made), and find out how much time elapsed between versions 1.3 and 1.4 in case it was added between checkins by someone not using RCS. We'd also like to know if any other changes were made between versions 1.3 and 1.4.
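The same search can be run in the other direction, from the oldest revision forward, stopping at the first revision that contains the text. This is a sketch under stated assumptions: the function name first_rev_with is hypothetical, and it assumes all revisions sit on the main trunk and are numbered 1.1 through 1.N with no branches. It uses "co -p", which prints a revision to standard output without touching the working file.

```shell
# first_rev_with: hypothetical helper, not part of RCS.
# Usage: first_rev_with FILE PATTERN N
# Prints the first revision from 1.1 through 1.N whose text matches
# PATTERN. Returns nonzero if no revision matches.
first_rev_with() {
    file=$1
    pattern=$2
    last=$3
    n=1
    while [ "$n" -le "$last" ]; do
        # co -p writes the requested revision to standard output
        if co -q -p"1.$n" "$file" 2>/dev/null | grep -q -e "$pattern"; then
            echo "1.$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}
```

In the situation described here, "first_rev_with hosts.equiv bozo 5" would print 1.4.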

We can determine this information fairly easily:

# rcsdiff -r1.4 -r1.3 hosts.equiv
===================================================================
RCS file: RCS/hosts.equiv,v
retrieving revision 1.4
retrieving revision 1.3
diff -r1.4 -r1.3
3d1
< trouble.badguy.org	bozo
# rlog -r1.4 hosts.equiv

RCS file: RCS/hosts.equiv,v
Working file: hosts.equiv
head: 1.5
branch:
locks: 
access list:
symbolic names:
keyword substitution: kv
total revisions: 5;     selected revisions: 1
description:
----------------------------
revision 1.4
date: 1998/05/28 09:58:55;  author: fred;  state: Exp;  lines: +1 -0
Added temporary access for John Doe.
----------------------------
=============================================================================

This tells me that this entry was made by "fred" no later than May 28th, that this was the only change made to the file at that time, and that it was made so that the user John Doe (presumably the bozo we're looking for) could access the system. I know that I should talk to "fred" about this account, and if they don't remember making it, it may mean that somebody unauthorized has access to this system. If "fred" didn't make this addition, then we also may have some idea as to when this unauthorized access occurred, but since the RCS file is just plain text, a sophisticated intruder theoretically could have modified it to throw us off. In any case, because we're using RCS, we have a good idea who made changes, what they were, and when they occurred.

RCS Usage in Shell Scripts

All of the interactive properties of RCS can be performed non-interactively by the appropriate use of certain flags. For example, for the "ci" or "co" command, the "-q" flag suppresses output, which can be useful in scripts; the "-m" flag can specify the log message to be used, although one should be careful about how one's shell escapes newlines if a multiline log message is to be included; and the "-t" flag can be used to set the descriptive text if the file is being checked in for the first time, otherwise it is ignored.
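Put together, a fully non-interactive check-in might look like the following sketch. The function name checkin_quiet and the description text are hypothetical. The "-t-" form passes the description as a string on the command line, and quoting the message variable keeps any embedded newlines intact.

```shell
# checkin_quiet: hypothetical helper, not part of RCS.
# Usage: checkin_quiet FILE MESSAGE
# Checks in FILE without prompting, leaving an unlocked, read-only
# working copy in place.
checkin_quiet() {
    file=$1
    message=$2
    # -q: no diagnostics; -u: keep the working file; -m: log message;
    # -t-: description text, used only on the first check-in
    ci -q -u -m"$message" -t-"Managed by script." "$file"
}
```

A script would call it as, for instance, checkin_quiet passwd "Added user fred."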

In all cases, the RCS commands issue different return codes depending on whether they were successful or not. As the shell programmer would expect, a return code of 0 indicates success while a return code of 1 indicates failure. Using this information, one can have a process wait until a lock can be obtained in the following manner:

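#!/bin/sh
# Wait for an RCS lock on a file, modify it, then check it back in.

file="passwd"

cd /etc || exit 1

# Placeholder: replace the body with the commands that modify $file.
workonfile()
{
	:
}

# Retry every 10 seconds until "co -l" succeeds in locking the file.
until co -q -l "$file" > /dev/null 2>&1
do
	sleep 10
done

workonfile

# Record the change and release the lock, keeping the file in place.
ci -q -u -m"New account added by automated process." "$file"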

Obviously, scripts are less adaptable to error conditions than humans are, so some care needs to be taken in how these commands are implemented. I recommend testing the script by running it twice simultaneously to see if the final results are sensible. Also, test to see how well everything works if the script and a person are trying to simultaneously edit a test file. Finally, do things like deleting the test file and intentionally corrupting it to see if the output is sensible. With some careful experimentation, one will find that by properly using RCS, automated modifications to files are less error prone, and they are easier to fix on the off chance that something does go wrong.

Many sites use RCS from scripts to assist in such things as automated account addition, update of DNS zone files, modification of important data files, etc. Many of these important files are periodically archived so that the state of the system at some previous time can be reconstructed to gain historical understanding, fix an error, or recover from some disastrous incident. This process can easily be two or more orders of magnitude more efficient using RCS than by using cp and compress on the same files.

More Information

There is a lot of extra information available. A good start is Walter Tichy's RCS paper, which is available on any FreeBSD system, as part of the BSD 4.4 documentation set, or at http://www.freebsd.org/doc/psd/13.rcs/paper.html. This paper is also available in nroff/troff format as part of the RCS source code distribution. Couple this paper with the man pages, and the prospective user should be off and running. The man pages on ci, co, rcs, and rcsintro should be read thoroughly; one may only need to skim the man pages on rlog and rcsdiff at first.

I also strongly recommend the book Applying RCS and SCCS by Don Bolinger and Tan Bronson, published by O'Reilly and Associates. This is an excellent explanation of RCS and SCCS written in an efficient manner with a focus on the programming environment. Fortunately, since the scope of their book is relatively narrow, they can really get into details and provide a large number of highly relevant examples. In my opinion, everyone who uses RCS or SCCS must have a copy of this book, and any programmer who uses version control in a multi-user environment, even if they are not using one of these packages, should read it as well.

RCS may already be installed on your system. Do a man rcs or which rcs to see if it's there and in your path. If one is running one of the Open Source Unices, like Linux or one of the BSDs, it is probably already there. If not, it can be retrieved from the Free Software Foundation anonymous FTP site as: ftp://aeneas.mit.edu/pub/gnu/rcs-5.7.tar.gz or at any of their mirrors. Installing it is usually as simple as unzipping it, untarring it, running configure and then make install.
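The check for an existing installation can also be scripted, which is handy when many machines need to be surveyed. This is a small sketch; the function name missing_tools is hypothetical, and it only tests whether each command can be found in the current path.

```shell
# missing_tools: hypothetical helper, not part of RCS.
# Prints the name of each listed command that cannot be found.
# No output means every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}
```

Running "missing_tools ci co rcs rlog rcsdiff" on a machine with a complete RCS installation prints nothing.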

Conclusion

In conclusion, there is really no good reason not to use version control on all of one's line-oriented documents. In this article, I have demonstrated how RCS can be used by system administrators to assist in the maintenance of system configuration files, but the choice of which version control system one uses isn't all that important. The only really wrong decision is to not use one at all. I believe users will find that if they just start using version control tools on a regular basis, in no time their proper use will become second nature, and they will prove to be valuable tools that will benefit all their computing endeavors.