
Kerberos / AFS (Overview)

Kerberos provides centralised authentication for EECE departmental resources, and AFS (the Andrew File System) is our network file system. The particular implementations we use are MIT Kerberos and OpenAFS.

Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/server applications by using secret-key cryptography. A free implementation of this protocol is available from the Massachusetts Institute of Technology (MIT). Kerberos is available in many commercial products as well (such as Microsoft Active Directory).

AFS was developed in the 1980s by the Computer Science Department of Carnegie Mellon University (CMU). The technology was spun off into a company named Transarc, which sold a commercial version of the software, and many other organisations, such as NASA, IBM, MIT and Stanford, adopted AFS. In the 1990s, Transarc was acquired by IBM. In 2000, IBM released the source of AFS, and so OpenAFS was born.

Operation

AFS differs from the other dominant Unix network filesystem, NFS, in some important ways:

Authentication

In contrast with NFS, which trusts the client machine to have authenticated its users, OpenAFS uses Kerberos to authenticate each user. This means it is not possible to gain access to AFS files without the user's credentials, even if the client OS itself is compromised. This feature makes AFS suitable for network file access on a client which is shared by both staff and students.

In order to access files on AFS, the user needs to (see the example after this list):

  • Obtain a Kerberos ticket from the Kerberos realm with which the AFS cell is configured. This is done either by the login process (PAM module) or by running the kinit command.
  • Obtain an AFS token using this ticket. This is done either by the login process (PAM module) or by running the aklog command.
  • Have rights to read the file according to its AFS ACL entries (stored in AFS itself, not the Unix permission bits), unless the files are world-readable.
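For example, on a typical Linux client the manual sequence looks roughly like this. The username and realm below are placeholders; the cell name ee.up.ac.za is the one listed further down.

  # Obtain a Kerberos ticket from the realm (prompts for the password).
  kinit user1@EE.UP.AC.ZA

  # Exchange the Kerberos ticket for an AFS token in the cell.
  aklog -c ee.up.ac.za

  # klist shows Kerberos tickets, tokens shows AFS tokens.
  klist
  tokens

  # Inspect the ACL that actually governs access to a directory.
  fs listacl /afs/ee.up.ac.za/home/user1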

Caching

AFS also differs from NFS in that file data is cached on the client's hard disk. This improves overall file system performance, since once data is cached it can be read locally. It is said that, because of this caching design, an AFS server can handle roughly ten times as many clients as an NFS server on the same hardware and network.

The AFS client includes a cache manager, which maintains the consistency of the cache by writing modified data back to the server. To ensure that cached reads remain valid, the server keeps track of which clients hold cached copies of a file; if the file data changes on the server, it notifies each of those clients so they know their cached version is stale. This means that AFS uses several ports and that connections travel in both directions, so unfortunately AFS does not work through restrictive firewalls.
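As an illustration, the client-side cache can be inspected and tuned with the fs command; the cache size and path below are example values only.

  # Show the configured cache size and current usage.
  fs getcacheparms

  # Change the disk cache size (in 1-kilobyte blocks) until the next restart.
  fs setcachesize 512000

  # Discard the cached copy of a file, forcing a fresh fetch from the server.
  fs flush /afs/ee.up.ac.za/home/user1/somefile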

Global namespace

All files in AFS are stored under the directory /afs. Each site which runs AFS is known as an AFS cell. The cell roughly corresponds to the site's Kerberos realm (authentication domain) and is usually named after a DNS domain. Some common cells are (see the query example after this list):

  • /afs/athena.mit.edu
  • /afs/cs.stanford.edu
  • /afs/ee.up.ac.za
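For instance, a client can report which cell it belongs to and which other cells its cache manager knows about:

  # Print the cell this workstation belongs to.
  fs wscell

  # List the cells (and their database servers) known to the cache manager.
  fs listcells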

Mount points in AFS are directory objects at which a unit called a volume is mounted. Volumes are stored on partitions, which in turn live on AFS file servers. A partition is exactly what the name suggests: a logical chunk of some physical disk on the server. Volumes usually correspond to some logical grouping of files. A user's home directory, for example, would be contained in a volume created specifically for that user and then mounted at some mount point under /afs/cellname/.
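As a sketch, creating and mounting such a home volume could look like this. The file server, volume and user names are illustrative; /vicepa is the conventional name for a file server's first AFS partition.

  # Create the user's volume on a file server partition.
  vos create fileserver1.ee.up.ac.za /vicepa user.user1

  # Mount the volume at its place in the cell's namespace.
  fs mkmount /afs/ee.up.ac.za/home/user1 user.user1

  # Show which volume is mounted at that directory.
  fs lsmount /afs/ee.up.ac.za/home/user1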

There is an important difference between the way AFS and NFS mounting works. AFS volumes are mounted at points in the AFS namespace on the server side, which means that every AFS client sees identical AFS paths. User1 and User2 will never access the same file via different paths because of the points at which their clients happened to mount volumes, as can happen with NFS. AFS clients never mount anything themselves; they just access files. Hence, AFS is said to have a global namespace.
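Because the path alone identifies a file everywhere, a client can simply ask AFS which file server and volume are behind a given path (the path below is a placeholder):

  # Report the file server machine(s) housing this path.
  fs whereis /afs/ee.up.ac.za/home/user1

  # Show the volume that contains it, plus its quota and usage.
  fs examine /afs/ee.up.ac.za/home/user1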

Management

One of the design requirements for AFS was that it should reduce administration burden by allowing central management of distributed services. AFS provides a number of commands that allow the administrator to manipulate servers, volumes, mount points, and ACLs on files from any workstation or server, provided the administrator has logged on there and obtained the AFS admin ticket/token.
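A minimal sketch of obtaining those admin credentials, assuming the common convention of a separate admin instance of the principal (the principal and realm names are placeholders):

  # Authenticate as the admin principal, then pick up an AFS token for it.
  kinit user1/admin@EE.UP.AC.ZA
  aklog -c ee.up.ac.za

  # Members of the built-in system:administrators group implicitly hold
  # the administer right on every ACL in the cell.
  pts membership system:administrators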

Any volume, mounted at any point in the AFS namespace, can be stored on any AFS file server in the cell, and volumes can be moved between servers even while they are in use, with users unlikely to notice anything. Administrators can therefore perform maintenance on a production file server without causing downtime, by first moving its volumes to temporary storage on another AFS server.
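A minimal sketch of such a move, with hypothetical server, partition and volume names:

  # Move a volume to another file server/partition while it stays online.
  vos move user.user1 fileserver1.ee.up.ac.za /vicepa fileserver2.ee.up.ac.za /vicepb

  # Confirm the volume's new location.
  vos examine user.user1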

AFS users also have the ability to adjust ACL permissions, create their own user groups, and so on, without help from the superuser.
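For example, a user could create a personal group and grant it access to a project directory. The names below are placeholders, following the convention that user-created groups are named owner:groupname.

  # Create a personal group and add a colleague to it.
  pts creategroup user1:project
  pts adduser user2 user1:project

  # Grant the group read (r) and lookup (l) rights on a directory.
  fs setacl /afs/ee.up.ac.za/home/user1/project user1:project rl

  # Review the resulting ACL.
  fs listacl /afs/ee.up.ac.za/home/user1/project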

Replication

AFS volumes can be replicated. Replication is usually used to provide local, read-only copies of frequently used volumes for users at remote sites, but replicas can also provide redundancy. The AFS client tries to make an intelligent decision when choosing which copy of a volume to access files from; it is, however, biased toward read-only replicas.
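As an illustration, replicating a volume to a second server involves defining read-only sites and then releasing the volume; the server and volume names are placeholders.

  # Define read-only replication sites for the volume on two file servers.
  vos addsite fileserver1.ee.up.ac.za /vicepa software.common
  vos addsite fileserver2.ee.up.ac.za /vicepa software.common

  # Push the current contents of the read-write volume out to its replicas.
  vos release software.common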